
Data Mining
Funds held in escrow. Online project, offline negotiation, arranged by 智城.

Client: Helen rodriguez    Contractor: Iphone_lancers    Status: Completed
Project ID: 100008
Project budget: $1,000-5,000
Development period: 7 days
Skills: PHP
Date posted: 2010-04-03

Description

I need a script to mine/scrape a website, extract certain data, and put it into an Excel spreadsheet for mail merges or for emailing to a mail house for mail fulfillment.

The collection process is:

Visit http://www.courtindex.com  
> Select “Member Login” 
> Login (enter user name and password, will be given to selected person) 
> Under Common Pleas, select today’s date (this site only publishes Monday – Friday).
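
Because the site only publishes Monday through Friday, a scheduled run would probably want to skip weekends before attempting the login steps above. The following is a minimal sketch of that check, written in Python purely for illustration (the listing asks for PHP). The function names are hypothetical, and court holidays are not handled.

```python
from datetime import date, timedelta


def is_publishing_day(day: date) -> bool:
    """Return True for Monday through Friday (weekday() is 0 for Monday)."""
    return day.weekday() < 5


def latest_publishing_day(day: date) -> date:
    """Step back from `day` to the most recent Monday-Friday date."""
    while not is_publishing_day(day):
        day -= timedelta(days=1)
    return day
```

A run on a Saturday or Sunday could either exit early or fall back to the preceding Friday via `latest_publishing_day`.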
 
The next screen on the website will display a list of common pleas cases.
The following is an example:

A-1002397 FORECLOSURE - U.S. Bank, N.A., 800 Moreland, Owensboro, KY 42304 vs. J. Calvin & Nancy G. Daugherty, 11443 Kenn Rd., 45240, et al. Foreclosure, money in the sum of $88,295.43, plus int. Lori N. Wight/80789, of Lerner, Sampson & Rothfuss, Attys.

The bold items above are what I need to scrape/mine: the case number and the defendant's name and address.  The first name and last name must be separated.  In the above example, the first name would be “Calvin & Nancy”.  The middle initial is not needed.
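
As a rough illustration of the extraction described above, here is a minimal Python sketch (the listing asks for PHP; Python is used only to show the logic). The record shape is an assumption inferred from the single example in this listing, and `RECORD_RE`, `split_name`, and `parse_record` are hypothetical names. Real listings will likely need more cases, for example a street address that contains its own five-digit number after the first word.

```python
import re

# Assumed record shape, inferred only from the one example in the listing:
#   <case number> <case type> - <plaintiff> vs. <defendant name>, <address>, <ZIP>, ...
RECORD_RE = re.compile(
    r"^(?P<case>\S+)\s.*?\bvs\.\s+"   # case number, then skip ahead to the first "vs."
    r"(?P<name>[^,]+),\s*"            # defendant name runs up to the first comma
    r"(?P<address>.+?)[,\s]\s*"       # address text in front of the ZIP
    r"(?P<zip>\d{5})\b"               # 5-digit postal code
)

# A lone letter with an optional period is treated as an initial and dropped.
INITIAL_RE = re.compile(r"[A-Za-z]\.?")


def split_name(full_name):
    """Split 'J. Calvin & Nancy G. Daugherty' into ('Calvin & Nancy', 'Daugherty')."""
    tokens = full_name.split()
    if not tokens:
        return "", ""
    last = tokens[-1]
    first = " ".join(t for t in tokens[:-1] if not INITIAL_RE.fullmatch(t))
    return first, last


def parse_record(line):
    """Return a dict of the wanted fields, or None if the line does not match."""
    match = RECORD_RE.search(line.strip())
    if match is None:
        return None
    first, last = split_name(match.group("name"))
    return {
        "case_number": match.group("case"),
        "first_name": first,
        "last_name": last,
        "address": match.group("address").strip(),
        "postal_code": match.group("zip"),
    }


EXAMPLE = (
    "A-1002397 FORECLOSURE - U.S. Bank, N.A., 800 Moreland, Owensboro, KY 42304 "
    "vs. J. Calvin & Nancy G. Daugherty, 11443 Kenn Rd., 45240, et al. "
    "Foreclosure, money in the sum of $88,295.43, plus int."
)
record = parse_record(EXAMPLE)
```

For the example record, this yields the case number `A-1002397`, first name `Calvin & Nancy`, last name `Daugherty`, address `11443 Kenn Rd.`, and postal code `45240`.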

This website is for a Cincinnati, OH-based court paper.  The paper omits the city and state when the address is a Cincinnati, OH mailing address.  The script will need to default to Cincinnati, OH or cross-reference the postal code.
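
The defaulting rule could look something like the Python sketch below (the listing asks for PHP; Python is used only to illustrate the logic). It assumes an address is written either as "street, city, ST 12345", as in the plaintiff address in the example, or as "street, 12345" for local addresses. `normalize_address` is a hypothetical name. The alternative of cross-referencing the postal code would need a ZIP lookup table, which is not shown here.

```python
import re

DEFAULT_CITY, DEFAULT_STATE = "Cincinnati", "OH"

# Assumed full form: "800 Moreland, Owensboro, KY 42304"
FULL_RE = re.compile(
    r"^(?P<street>.+),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)
# Assumed local form with city and state omitted: "11443 Kenn Rd., 45240"
LOCAL_RE = re.compile(r"^(?P<street>.+),\s*(?P<zip>\d{5})$")


def normalize_address(text):
    """Return (street, city, state, zip), defaulting to Cincinnati, OH."""
    text = text.strip()
    match = FULL_RE.match(text)
    if match:
        return (match["street"], match["city"].strip(),
                match["state"], match["zip"])
    match = LOCAL_RE.match(text)
    if match:
        # No city/state printed, so fall back to the paper's home city.
        return match["street"], DEFAULT_CITY, DEFAULT_STATE, match["zip"]
    return None  # unrecognized shape; leave for manual review
```

With these assumptions, `normalize_address("11443 Kenn Rd., 45240")` fills in Cincinnati, OH, while an address that already names its city and state is passed through unchanged.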

The Excel file will include the following columns:
First Name, Last Name, Address, City, State, Postal Code, Case Number
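
One simple way to produce a file with these columns is to write CSV, which Excel can open directly. The sketch below is in Python for illustration only (the listing asks for PHP), and it assumes CSV is an acceptable stand-in for a native Excel file. `records_to_csv` is a hypothetical name.

```python
import csv
import io

# Column order taken from the listing.
COLUMNS = ["First Name", "Last Name", "Address", "City",
           "State", "Postal Code", "Case Number"]


def records_to_csv(records):
    """Serialize a list of dicts keyed by COLUMNS into CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = records_to_csv([
    {
        "First Name": "Calvin & Nancy",
        "Last Name": "Daugherty",
        "Address": "11443 Kenn Rd.",
        "City": "Cincinnati",
        "State": "OH",
        "Postal Code": "45240",
        "Case Number": "A-1002397",
    }
])
```

The resulting text can be saved with a `.csv` extension or served as a download from the web-hosted process.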

I want this to be a web-hosted process.

The long-term plan is to mine/scrape multiple sites and place the data in a hosted DB with a member login interface where data can be pulled by State, County, Zip Code, etc.

We need this completed within seven business days of the date of candidate selection.

Candidate must have strong written and spoken English skills.

THANKS!
