
Crawler for prosurf.sbb.ch (funds held in escrow; online project, negotiated offline, arranged by Zhicheng)

Client: Nancy davis  Contractor: Iprog  Status: Completed
Project ID: 94330
Budget: $1,000
Development period: 7 days
Skills: PHP, XML
Posted: 2009-12-28

Description

I would like to have the data from this website, http://prosurf.sbb.ch/pros/inter/mainsite_e.html, as XML. For almost every Swiss train station, the website displays the exact arrival and departure times of trains.
The website lists the train stations alphabetically: http://prosurf.sbb.ch/pros/inter/prosurfservlet?TRANSACTION=093&ENTRYPAGE=A shows all train stations beginning with A, http://prosurf.sbb.ch/pros/inter/prosurfservlet?TRANSACTION=093&ENTRYPAGE=B shows those beginning with B, and so on.
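As a rough illustration of the alphabetical index described above, the 26 index URLs could be generated as follows. This is a minimal Python sketch, not the PHP the listing asks for. The function name `index_urls` is hypothetical, and it assumes the index only has pages for the letters A to Z.

```python
import string
from urllib.parse import urlencode

# Servlet address taken from the URLs quoted in the description.
BASE_URL = "http://prosurf.sbb.ch/pros/inter/prosurfservlet"


def index_urls():
    """Return one station-index URL per letter (assumes A-Z only)."""
    return [
        f"{BASE_URL}?{urlencode({'TRANSACTION': '093', 'ENTRYPAGE': letter})}"
        for letter in string.ascii_uppercase
    ]


if __name__ == "__main__":
    for url in index_urls()[:2]:
        print(url)
```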

For each individual train station (e.g. Lausanne, http://prosurf.sbb.ch/pros/inter/prosurfservlet?TRANSACTION=004&LANGUAGE=e&PBP=LS&DIRECTION=2 ), one page shows the arrival times (http://prosurf.sbb.ch/pros/inter/prosurfservlet?transaction=004&language=e&pbp=LS&direction=1) and a second page shows the departure times (http://prosurf.sbb.ch/pros/inter/prosurfservlet?transaction=004&language=e&pbp=LS&direction=2)
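The per-station URLs follow a simple pattern: a station code in `pbp` and a `direction` of 1 for arrivals or 2 for departures. A small Python sketch of a URL builder is shown below. The names `station_url`, `ARRIVALS` and `DEPARTURES` are hypothetical, and the lowercase parameter spelling is copied from the arrival and departure examples above.

```python
from urllib.parse import urlencode

# Servlet address taken from the URLs quoted in the description.
BASE_URL = "http://prosurf.sbb.ch/pros/inter/prosurfservlet"

# Direction values as described in the listing.
ARRIVALS = 1
DEPARTURES = 2


def station_url(pbp, direction):
    """Build the arrivals or departures URL for a station code such as 'LS'."""
    if direction not in (ARRIVALS, DEPARTURES):
        raise ValueError("direction must be 1 (arrivals) or 2 (departures)")
    query = urlencode({
        "transaction": "004",
        "language": "e",
        "pbp": pbp,
        "direction": direction,
    })
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(station_url("LS", ARRIVALS))
    print(station_url("LS", DEPARTURES))
```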

On each page, the individual trains are linked by train number. Clicking a train number shows the full details of that train line, including the expected delays (in the field "current").

As the end result I would like either one large XML file or 26 XML files (one for each letter of the alphabet), containing:

the train stations

per station, the train numbers

per train number, the data: type of train, start station, final destination, expected arrival and departure times, actual arrival and departure times, departure delay, and arrival delay.
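One possible shape for the requested output, sketched in Python with the standard `xml.etree.ElementTree` module. The listing does not specify a schema, so every element and attribute name here (`stations`, `station`, `train`, and the field names) is an assumption. The sample record in the usage section is made-up placeholder data, not real timetable data.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for the nine per-train fields in the listing.
TRAIN_FIELDS = (
    "type", "start_station", "final_destination",
    "expected_arrival", "expected_departure",
    "actual_arrival", "actual_departure",
    "delay_arrival", "delay_departure",
)


def build_feed(stations):
    """Serialize a list of station dicts into one XML string."""
    root = ET.Element("stations")
    for station in stations:
        station_el = ET.SubElement(root, "station", name=station["name"])
        for train in station["trains"]:
            train_el = ET.SubElement(station_el, "train", number=train["number"])
            for field in TRAIN_FIELDS:
                # Missing values become empty elements.
                ET.SubElement(train_el, field).text = train.get(field, "")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder data for illustration only.
    sample = [{"name": "Lausanne",
               "trains": [{"number": "X1", "type": "IC", "delay_arrival": "2"}]}]
    print(build_feed(sample))
```

Calling `build_feed` once with all stations gives the single large file. Calling it once per letter, with the stations filtered by initial, gives the 26-file variant.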
 
This "crawler" should update its information every 5 minutes so that I have a truly up-to-date news feed.
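The 5-minute refresh could be driven by a simple fixed-interval loop, sketched here in Python. The name `run_periodically` and its parameters are hypothetical. The clock and sleep functions are injectable only so the loop can be exercised without waiting. In production a cron job or similar scheduler may be a better fit.

```python
import time

REFRESH_SECONDS = 5 * 60  # the 5-minute interval requested in the listing


def run_periodically(job, interval=REFRESH_SECONDS, max_runs=None,
                     clock=time.monotonic, sleep=time.sleep):
    """Call job() every `interval` seconds, measured start to start.

    Runs forever when max_runs is None. Returns the number of runs.
    """
    runs = 0
    next_start = clock()
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        next_start += interval
        delay = next_start - clock()
        if delay > 0:
            sleep(delay)
        else:
            # The job took longer than the interval: restart the schedule now.
            next_start = clock()
    return runs


if __name__ == "__main__":
    # Short interval so the demo finishes quickly.
    run_periodically(lambda: print("crawl pass"), interval=1, max_runs=2)
```

Scheduling from start to start, rather than sleeping a flat 5 minutes after each pass, keeps the feed from drifting when a crawl pass itself takes a while.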

BE AWARE: this website has some built-in mechanisms to prevent leeching (a maximum number of pages allowed per minute, possibly IP tracking), so a solution is needed for this as well (proxy usage, caching, etc.).
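Of the options mentioned, caching combined with request throttling is the easiest to sketch. The Python example below wraps any fetch function so that repeated URLs are served from a cache and real requests are spaced apart. The class name `ThrottledCache` is hypothetical. The 2-second spacing and 5-minute lifetime are placeholder assumptions, since the site's actual limits are not stated in the listing.

```python
import time


class ThrottledCache:
    """Wrap a fetch(url) callable with a minimum request gap and a TTL cache."""

    def __init__(self, fetch, min_interval=2.0, ttl=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval  # assumed value, not the site's limit
        self._ttl = ttl                    # assumed to match the 5-minute refresh
        self._clock = clock
        self._sleep = sleep
        self._last_request = None
        self._cache = {}  # url -> (fetched_at, body)

    def get(self, url):
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no request is made
        if self._last_request is not None:
            wait = self._min_interval - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)  # space out real requests
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body


if __name__ == "__main__":
    cache = ThrottledCache(lambda url: f"<html>{url}</html>", min_interval=0.5)
    print(cache.get("page-1"))
    print(cache.get("page-1"))  # served from the cache
```

The train detail pages are a natural fit for this, because the same train number is linked from every station it passes through.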
