Message Board

Newbie/Basic Questions

 How Do Bots Sift Through a Web Site?
Author: J.Yard2   (22 Apr 07 12:26am)
How Do Bots Sift Through a Web Site?

Do they just start at the main page and follow ALL the links?
Or are they more selective than that?
How do they select which links to follow?
Do they typically go through the entire site all at one time, or do they visit a few pages, move on to another web site, then come back later and visit a few more pages, etc.?
 
 Re: How Do Bots Sift Through a Web Site?
Author: M.Prince   (23 Apr 07 4:26pm)
It depends on the robot. Generally, the robots that spammers use are pretty dumb: we see some harvesters hit the same honey pot over and over throughout the course of a week. The rare spam harvester is extremely smart. It will avoid pages containing spam-trap-like words (e.g., "harvester", "spam trap", "poisoned", etc.). Even more sneaky, it will load the same page twice in a short amount of time and often from two different IPs. If the page changes then the harvester assumes it's a trap and doesn't pick up the addresses. Neat stuff. Even neater is that we take this into account and don't fall for their anti-anti-spam tricks.
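For anyone who wants to picture the double-load check described above, here is a rough Python sketch. It is only an illustration of the idea, not code from any real harvester or from Project Honey Pot: the function names, the word list, and the use of a SHA-256 hash to compare the two loads are all assumptions made for the example. The two page bodies are passed in as strings, so how (or from which IPs) they were fetched is left out.

```python
import hashlib

# Hypothetical word list, taken from the examples mentioned in the post.
TRAP_WORDS = ("harvester", "spam trap")


def fingerprint(html: str) -> str:
    """Return a stable hash of the page body so two loads can be compared."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def looks_like_trap(first_load: str, second_load: str, trap_words=TRAP_WORDS) -> bool:
    """Guess whether a page is a trap, the way a 'smart' harvester might.

    The page is treated as suspicious if it contains a trap-like word,
    or if its content differs between two loads made close together.
    """
    text = first_load.lower()
    if any(word in text for word in trap_words):
        return True
    # A page that changes between two quick loads is assumed to be dynamic bait.
    return fingerprint(first_load) != fingerprint(second_load)
```

A page that hands out a different address on each load would be flagged by the comparison, while a static page with no trap-like words would pass.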
 
 Re: How Do Bots Sift Through a Web Site?
Author: J.Yard2   (23 Apr 07 9:29pm)
"Even more sneaky, it will load the same page twice in a short amount of time and often from two different IPs. If the page changes then the harvester assumes it's a trap and doesn't pick up the addresses."

Forcing harvesters to avoid dynamic pages would be a big win for us. No big deal for us to make all our pages dynamic.
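As a rough idea of what "making a page dynamic" could look like, here is a minimal Python sketch. It assumes the only goal is for the HTML to differ on every load; the function name and the trick of appending a random token in an HTML comment are hypothetical choices for illustration, and whether a given harvester would actually be put off by this is not something the thread confirms.

```python
import secrets


def render_dynamic(body_html: str) -> str:
    """Wrap static content so that every rendered copy is slightly different.

    A fresh random token is placed in an HTML comment, which visitors
    never see but which changes the page's bytes on each load.
    """
    token = secrets.token_hex(8)  # 16 hex characters, new for every call
    return f"<html><body>{body_html}<!-- {token} --></body></html>"
```

Calling render_dynamic twice with the same body gives two pages that look identical in a browser but do not compare equal as text.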

I must be getting visited by the smart ones because they are doing an impressive job of avoiding my honey pot. Only about 10% of front page visits go on to another page, and only one has hit the honey pot in nearly a week.

<a href="a link"><img border="0" src="a transparent image.gif"></a>

Post Edited (23 Apr 07 9:43pm)
 
 Re: How Do Bots Sift Through a Web Site?
Author: M.Prince   (23 Apr 07 10:44pm)
It takes time before you catch anything. And, remember, not only does it take time to hit the honey pot pages, but it also takes time between when a page is hit and when a message is sent to an email address distributed there. Sometimes more than a year. Give it time. The harvesters and other bad bots will come.

Matthew.




Copyright © 2004–25, Unspam Technologies, Inc. All rights reserved.
