Site Finder (funds held in escrow)

Client : Mary anthony Status : Bidding closed
Project ID : 94585
Budget : $1,000-5,000
Development period : 7 days
Skills : Search C
Posted : 2010-01-03

Description

A desktop app that searches for websites in the main search engines, then follows the URL to each site it finds.


We presently use a program written in Visual C++ (all source files are available). The programmer is no longer alive. It works well but needs small modifications and better memory management. You may start from scratch or modify the original source files.



The main core app:


Reading from editable (CSV) lists, the app visits the first listed search engine and searches for a listed keyword a given number of pages deep, matching the results against the listed URL. If a match is found, it opens the link listed by the search engine in the embedded browser (unless it's a sponsored link).
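
The matching step could be sketched as below. This is only an illustration, in Python for brevity rather than the C++ of the existing app. The names `host_of` and `is_match` are hypothetical, and the `sponsored` flag is assumed to be supplied by whatever code extracts results from the search page, since how each engine marks sponsored links is not covered here.

```python
from urllib.parse import urlsplit

def host_of(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    # urlsplit only fills in .hostname when a scheme or '//' is present,
    # so bare entries such as 'mysite.com' get a '//' prefix first.
    if not url.startswith(("http://", "https://", "//")):
        url = "//" + url
    host = urlsplit(url.strip()).hostname or ""
    return host[4:] if host.startswith("www.") else host

def is_match(result_url: str, listed_site: str, sponsored: bool = False) -> bool:
    """True when a non-sponsored result points at the listed site or a subdomain."""
    if sponsored:
        return False  # sponsored links are never followed
    host = host_of(result_url)
    site = host_of(listed_site)
    return bool(site) and (host == site or host.endswith("." + site))
```

Comparing hosts rather than doing a plain substring test avoids false hits such as 'notmysite.com' matching 'mysite.com'.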


Random delays are set between searches and page visits, as search engines will block 'non-human' or 'bot' activity.


Hence the need to make the whole process look as human as possible.



Taking an example from the existing app, the editable files could be similar to:


Search engines:


"http://www.bing.com/search?q=","&first=","",""


"http://uk.search.yahoo.com/search?p=","&b=","",""


"http://uk.altavista.com/web/results?q=","&stq=","",""
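
As a rough sketch of how such a list might be read and turned into result-page URLs (Python, for illustration only): `load_engines` and `search_url` are hypothetical names, the third and fourth fields are ignored because their meaning is not stated, and the paging parameter is assumed to be a 1-based result offset with 10 results per page, which should be checked against each engine.

```python
import csv
from urllib.parse import quote_plus

ENGINES_CSV = '''"http://www.bing.com/search?q=","&first=","",""
"http://uk.search.yahoo.com/search?p=","&b=","",""
'''

def load_engines(text):
    """Return (query_prefix, page_param) pairs, one per non-empty CSV row."""
    engines = []
    for row in csv.reader(text.splitlines(), skipinitialspace=True):
        if len(row) >= 2 and row[0].strip():
            # Fields 3 and 4 are not described in the brief, so they are skipped.
            engines.append((row[0].strip(), row[1].strip()))
    return engines

def search_url(engine, keyword, page, per_page=10):
    """Build the URL for the given 1-based results page."""
    prefix, page_param = engine
    url = prefix + quote_plus(keyword)
    if page > 1:
        # Assumption: the paging parameter is a 1-based result offset.
        url += page_param + str((page - 1) * per_page + 1)
    return url
```

For example, `search_url(load_engines(ENGINES_CSV)[0], "blue widgets", 2)` builds the Bing URL for the second page of results under those assumptions.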



URLs:


mysite.com,articles/mypage.html,keyword,


anothersite.biz,,keyword,
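
One possible way to read this list, shown as a short Python sketch. The `Target` type and `load_targets` function are hypothetical names, and the field order (site, optional second page, keyword) is taken from the two example rows above.

```python
import csv
from typing import NamedTuple

class Target(NamedTuple):
    site: str      # e.g. 'mysite.com'
    page: str      # optional second page on that site, '' when not given
    keyword: str   # keyword to search for

def load_targets(text):
    """Parse 'site,page,keyword,' rows, skipping blank lines."""
    targets = []
    for row in csv.reader(text.splitlines()):
        # Pad short rows so unpacking always sees three fields.
        fields = [f.strip() for f in row] + ["", "", ""]
        site, page, keyword = fields[:3]
        if site:
            targets.append(Target(site, page, keyword))
    return targets
```

An empty `page` field marks the case where the app is free to choose any page on the site.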



In the above example, it would first search Bing XX pages deep for the mysite.com 'keyword'. If mysite.com is found, it will follow the URL, then after XX seconds follow on to 'articles/mypage.html' on that site. After remaining there for XX seconds, the next search, on Yahoo, begins for the same site and keyword, followed by Altavista. Cookies and cache are then cleared. It then starts again with Bing, searching the given 'keyword' for 'anothersite.biz', and so on. If the site URL is not found for any keyword after XX searches, then the whole URL ('mysite.com') is searched for. If still not found, it moves on to the next search. It could add any totally failed searches to a recycling 'failed' text file (limit size to 50k to preserve memory?).
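
The ordering described above might be sketched like this in Python. Everything here is illustrative: `run_cycle`, `cap_log`, and the `search` callback are hypothetical names, the fallback is read as "retry with the site URL as the query on the same engine", and '50k' is read as 50,000 bytes. The real search, page visits, delays, and cookie clearing are left to the caller.

```python
def run_cycle(engines, targets, search):
    """Run one pass over all targets; return the (engine, site) pairs that failed.

    `search(engine, query, site)` is supplied by the caller and returns True
    when `site` appears in the results for `query` on `engine`.
    """
    failed = []
    for site, keyword in targets:
        for engine in engines:
            found = search(engine, keyword, site)
            if not found:
                # Fallback: search for the whole site URL instead of the keyword.
                found = search(engine, site, site)
            if not found:
                failed.append((engine, site))
        # Cookies and cache would be cleared here, before the next site.
    return failed

def cap_log(lines, limit=50_000):
    """Drop the oldest entries until the log fits in `limit` bytes."""
    lines = list(lines)
    # Each entry counts its UTF-8 length plus one byte for the newline.
    while lines and sum(len(s.encode("utf-8")) + 1 for s in lines) > limit:
        lines.pop(0)
    return lines
```

Trimming from the front keeps the most recent failures, which is one reading of a 'recycling' file.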


NOTE** For 'anothersite.biz' the app would follow any page on that site at random, as no second page is given.
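
A small Python sketch of that choice follows. `pick_second_page` is a hypothetical helper, and `links` is assumed to be the list of href values already extracted from the landing page by the browser component.

```python
import random
from urllib.parse import urljoin, urlsplit

def pick_second_page(site_url, listed_page, links, rng=random):
    """Return the URL to visit after the landing page, or None if nothing fits."""
    if listed_page:
        # A page was listed for this site, so always go there.
        return urljoin(site_url, listed_page)
    host = urlsplit(site_url).hostname
    # Resolve relative links and keep only those that stay on the same host.
    same_site = [u for u in (urljoin(site_url, link) for link in links)
                 if urlsplit(u).hostname == host]
    return rng.choice(same_site) if same_site else None
```

Filtering by host keeps the random visit on the target site rather than wandering off to an external link.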


The search engine and url lists will never be large (no more than 10 items long).


You will need to check syntax used by the main search engines, though the ones above are correct.


We would manually edit the search engine and URL CSV files, following the app's required syntax.



The control panel would have 'start', 'pause' and 'stop' buttons, and a place to enter these options:


1. Random delay of up to xx seconds between search engine deep page queries. (This setting is also used to determine how long the app stays on the site's first page before continuing to the second page.)


2. Random or fixed delay option of xxx seconds to stay on the second page before starting a new search.
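
The two options could be held in a small settings object, sketched here in Python. The `Delays` class, its field names, and the default numbers are placeholders for the xx values. The random range for the second page (half to full value) is an assumption, since the description gives no lower bound.

```python
import random
from dataclasses import dataclass

@dataclass
class Delays:
    search_min: float = 5.0           # option 1: lower bound, seconds (placeholder)
    search_max: float = 20.0          # option 1: upper bound, seconds (placeholder)
    second_page: float = 60.0         # option 2: seconds on the second page (placeholder)
    second_page_random: bool = True   # option 2: random or fixed

    def search_delay(self, rng=random):
        """Delay between result pages, also used for the site's first page."""
        return rng.uniform(self.search_min, self.search_max)

    def second_page_delay(self, rng=random):
        """Time to stay on the second page before the next search."""
        if self.second_page_random:
            # Assumption: 'random' means somewhere between half and all of the value.
            return rng.uniform(self.second_page / 2, self.second_page)
        return self.second_page
```

The caller would pass the returned number of seconds to its own timer or sleep call, so pause and stop can still interrupt a wait.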



If overcoming blocked IPs for persistent requests is a problem, we can use HideMyIp, which automatically changes the IP every XXX minutes. That said, the present app has worked happily for 2 years on the same IP, changed daily, without being banned. So the app would need to sample the IP currently in use and adopt it automatically.


The internet connection can sometimes break for a couple of minutes, so the app should stay stable and continue once the connection returns.
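
One common way to ride out a short outage is a retry loop, sketched below in Python. `with_retry` is a hypothetical helper. It assumes that a dropped connection surfaces as an `OSError`, which holds for Python's socket and urllib errors; an embedded browser would report failures differently and need its own check.

```python
import time

def with_retry(action, attempts=10, wait=30.0, sleep=time.sleep):
    """Call action() until it succeeds, pausing `wait` seconds after each OSError."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # still failing after the last attempt: let the caller decide
            sleep(wait)
```

With the placeholder defaults this waits about five minutes in total, which covers the 'couple of minutes' outages described while still giving up eventually.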


It would be useful to spoof the user agent (though mimicking different user agents may be more difficult using an embedded browser).



If starting from scratch, you may want to consider a C++ desktop app with an embedded web browser, EmbeddedWeb, or WatiN browser automation.


I will be guided by your expert advice.



There are possible additions to this app, though first I need to find a writer who can accomplish the above app.

