Scraping/Mining data from 5 Public Websites (funds held in escrow; online project, offline negotiation, arranged by 智城)

Client: Barbara burrows Contractor: Itgenes Status: Completed
Project ID: 100499
Budget: $1,500
Development period: 7 days
Skills: CSS XML
Posted: 2010-04-15

Description

The scope of this task is to scrape one particular section of each of 5 public websites that permit scraping via robots.txt. Some details:

-The section of each website can be identified by a keyword in the URL.

-The number of documents varies per site, but the average is about 10K.

-Some structured data will need to be extracted from each page, such as the URL, the page title and other information within HTML or CSS tags. Approximately 9 attributes will be extracted.

-Data should be delivered as a CSV or XML file in a previously agreed-upon format.

-If the task is completed to a high standard, weekly or bi-weekly refreshes can be negotiated at additional cost.
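
The requirements above could be sketched roughly as follows. This is only an illustration using the Python standard library, not the agreed deliverable: the function names, the `example.com` URLs, the `reports` keyword and the two-column CSV layout are all assumptions, and only the URL and page title are extracted because the posting does not name the remaining attributes.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def in_section(url, keyword):
    """Return True when the URL path contains the section keyword."""
    return keyword in urlparse(url).path


def allowed_by_robots(robots_txt, url, agent="*"):
    """Check a URL against robots.txt rules supplied as plain text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_record(url, html):
    """Build one output row (URL and page title) from a fetched page."""
    parser = TitleParser()
    parser.feed(html)
    return {"url": url, "title": parser.title.strip()}


def to_csv(records, fields=("url", "title")):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical inputs; a real run would fetch robots.txt and each page.
    robots = "User-agent: *\nDisallow: /private/\n"
    url = "https://example.com/reports/1"
    page = "<html><head><title>Q1 Report</title></head><body></body></html>"
    if in_section(url, "reports") and allowed_by_robots(robots, url):
        print(to_csv([extract_record(url, page)]))
```

Fetching is left out on purpose so the sketch runs offline. A real crawl of roughly 10K pages per site would also need request throttling and error handling, and the other attributes would need their own extraction rules once the format is agreed.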

Please contact me if you have any questions or concerns. I look forward to working with you.
