searchbot

 
searchwire
Newbie

Joined: 05 Oct 2005
Posts: 4
Reputation: 7.2

Posted: Wed Oct 05, 2005 9:08 pm    Post subject: searchbot

Hi Faiz,

My name's Mike and I was hoping one of you guys may be able to offer advice or indeed help.
I have a search engine called Searchwire. It's written in PHP, and basically what I am trying to do is find a developer who can create a bot for me that will act in a similar way to Googlebot.

Trawl the net, add links to my site automatically, etc. I have searched high and low on the net with no joy. Could you offer advice, or do you know who I should contact about this?

Your thoughts would be greatly appreciated:))

Sincere Regards
Mike

darkmonkey
The Merovingian

Joined: 18 Apr 2004
Posts: 2557
Location: London, England
Reputation: 39.3
votes: 7

Posted: Thu Oct 06, 2005 7:08 am

I have made one, in Perl, which trawls the web for links, queues them, and then searches those pages.

It is, however, very inefficient (it takes loads of memory) and won't find non-XHTML links as well (i.e., the regex only matches href="*?", rather than all variations).
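
To illustrate the href limitation mentioned above, here is a minimal sketch of link extraction that does not depend on one quoting style. It is written in Python rather than Perl, uses the standard-library HTML parser instead of a regex, and is not the bot discussed in this thread; the names `LinkExtractor` and `extract_links` and the example.com URLs are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, whatever the quoting style."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Double-quoted, single-quoted with upper-case tag, and unquoted hrefs.
sample = """<a href="/a.html">A</a> <A HREF='b.html'>B</A> <a href=c.html>C</a>"""
for link in extract_links(sample, "http://example.com/dir/"):
    print(link)
```

A parser copes with the quoting and case variations that a single href="..." pattern skips, at the cost of being a bit slower than a regex.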

I would be happy to give it to you, and to forward your enquiry to someone I know who has done the regexes, etc. Both versions are in Perl, but can be easily converted =].

Quote:
add links to my site automatically etc


It'll be easy(ish) to add the links to a database, but the problem is that the spider can't distinguish "good" links from "bad" links. That will give you thousands of results, which will be useless (my last 20-minute trawl indexed around 20,000 pages, most of which had 20+ links). For instance, imagine a spider getting caught on this board: it'd have thousands of links, mostly the same, to go through.
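
One common way to keep a spider from going round in circles on a site like this board is to remember every URL already queued and to cap the pages taken from any one host. The sketch below shows that idea in Python; it is an assumption about how it could be done, not the Perl bot from this thread. `crawl`, `get_links`, `max_per_host` and the `fake_web` dictionary are hypothetical names, and `get_links` stands in for whatever actually downloads a page and extracts its links.

```python
from collections import Counter, deque
from urllib.parse import urldefrag, urlsplit


def crawl(seed, get_links, max_pages=100, max_per_host=20):
    """Breadth-first crawl that never queues the same URL twice.

    get_links(url) is supplied by the caller (hypothetical here) and
    returns the absolute links found on that page.
    """
    seed = urldefrag(seed)[0]
    queue = deque([seed])
    seen = {seed}          # every URL ever queued
    per_host = Counter()   # pages visited per host
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        host = urlsplit(url).netloc
        if per_host[host] >= max_per_host:
            continue  # the cap stops one big site swallowing the crawl
        per_host[host] += 1
        visited.append(url)
        for link in get_links(url):
            link = urldefrag(link)[0]  # page.html#top is the same page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# A tiny in-memory "web" so the example runs without network access.
fake_web = {
    "http://forum.example/": [
        "http://forum.example/t1",
        "http://forum.example/t1#top",
        "http://forum.example/t2",
        "http://other.example/",
    ],
    "http://forum.example/t1": ["http://forum.example/", "http://forum.example/t2"],
    "http://forum.example/t2": ["http://forum.example/t1"],
    "http://other.example/": [],
}
print(crawl("http://forum.example/", lambda u: fake_web.get(u, []), max_per_host=2))
```

This only removes exact repeats and limits volume per site. It does nothing to tell "good" links from "bad" ones, which is the harder problem described above.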

Hope this is of some help =]

_________________
~ Josh
[ Need bot hosting on a dedicated server? PM me. ]

searchwire
Posted: Thu Oct 06, 2005 8:00 am

Hi :)

That's excellent :) Could I download this, please, to take a look at it?

Bestest
Mike

searchwire
Posted: Thu Oct 06, 2005 8:20 am    Post subject: Bot

Hi Josh :)
Downloaded the progiee, that's cool. So, if I may, let me give you a little more info. My site is in PHP, coded for me by someone else, and basically my understanding of HTML is 100%, but for Perl and PHP it's 0%.

Is it possible you could contact this dude who can make the bot more intelligent? Here is the idea I have:

Category: Business
There are lots of sub-categories, though. However, the bot could perhaps be given trawl instructions for specifics, e.g. Business > Accountants, and return a search based on this.

OK, so ideally:

The bot trawls the net under the trawl term Business > Sub-category and perhaps reads each site's meta tags, title, etc. (robots.txt).

Then the bot reports home with the information and dumps it into a database or whatever, which is then uploaded to the site.
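
The "read the title and meta tags, then dump them into a database" step could look roughly like the sketch below. It is in Python with an in-memory SQLite database purely for illustration; the `pages` table, the `PageInfo` class and `store_page` are made-up names, and a real setup would more likely write to whatever database the PHP site already uses.

```python
import sqlite3
from html.parser import HTMLParser


class PageInfo(HTMLParser):
    """Pulls the <title> text and <meta name=... content=...> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Meta names are case-insensitive in practice, so normalise them.
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def store_page(db, url, html):
    """Parse one page and insert (or overwrite) its row in the pages table."""
    info = PageInfo()
    info.feed(html)
    info.close()
    db.execute(
        "INSERT OR REPLACE INTO pages (url, title, description, keywords) "
        "VALUES (?, ?, ?, ?)",
        (url, info.title.strip(),
         info.meta.get("description", ""), info.meta.get("keywords", "")),
    )
    db.commit()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, "
    "description TEXT, keywords TEXT)"
)
page = (
    "<html><head><title>Smith and Co Accountants</title>"
    '<meta name="Description" content="Chartered accountants in Leeds">'
    '<meta name="keywords" content="accountants, tax, audit">'
    "</head><body></body></html>"
)
store_page(db, "http://example.com/", page)
print(db.execute("SELECT url, title, keywords FROM pages").fetchall())
```

Using the URL as the primary key means a page that is trawled again overwrites its old row instead of piling up duplicates.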

My site has Perl, PHP etc on the hosting.

Hope you understand all this, mate :)

Finally, for your help in making this bot more intelligent, I would pay, say, £100, if that's OK :)

Really appreciate your help Josh

Best
Mike

darkmonkey
Posted: Thu Oct 06, 2005 9:06 pm

Please add me on MSN to discuss further. I have PM'd you my MSN email.

Using meta tags will be difficult, as it will miss certain things. How do you define "business"? You would have to list business types and professions... and then you would still miss some. It's complicated. MSN me to talk about it =].
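
The "list business types and professions" approach can be sketched as a simple keyword match, which also shows why it misses things. This is a Python illustration under stated assumptions only: the `CATEGORIES` table and `categorise` function are hypothetical, and the keyword lists are far too short to be useful as they stand.

```python
import re

# Hypothetical category -> keyword lists. A real list would be far longer
# and would still miss businesses that describe themselves differently.
CATEGORIES = {
    "Business > Accountants": {"accountant", "accountants", "bookkeeping", "audit"},
    "Business > Solicitors": {"solicitor", "solicitors", "conveyancing"},
}


def categorise(text):
    """Return the categories whose keywords appear in the text, best match first."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(words & keys) for cat, keys in CATEGORIES.items()}
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [cat for cat, hits in ranked if hits > 0]


# Matches, because the page happens to use words from the list.
print(categorise("Chartered accountants offering audit and tax advice"))
# Same kind of business, but no listed keyword appears, so nothing matches.
print(categorise("We help small firms keep their books straight"))
```

The second call is the failure case described above: the page is about the same trade, but none of the listed words occur in it.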

Quote:
Downloaded the progiee, that's cool


If you mean the one in my signature, that's not it :P. That's a bot for MSN, not what you want. This is what you want :P.

searchwire
Posted: Thu Oct 06, 2005 10:37 pm    Post subject: Spider

Hi Josh,
Yes, that's the progiee :) When are you next online, dude? :)

Mike