Author: M.Nordhoff   (27 Aug 06 5:32pm)
A feature that would be useful to me right now is the ability to search the database by the User-Agent that crawlers use. That way I could, e.g., search for "psycheclone" to see all of the different IP addresses it uses.
 Re: Search by User-Agent
Author: M.Prince   (27 Aug 06 10:05pm)
It just so happens that we have just implemented something like that, although it may not be exactly what you want. If you go to the IP Inspector page for a given harvester, there will be a list of all the useragents that the harvester uses. For example, since you mentioned "psycheclone," check out:

While they don't look like links, if you click on any of the useragents listed there it'll take you to the Harvesters page with a list of everyone who has used that useragent. Here's a link to "psycheclone":

Coming soon, you'll also be able to click on the useragents in the top spam harvester useragents list and go to the same kind of list:

We may implement something where you can look up useragents through an actual query. However, this creates some challenges (albeit surmountable ones) in terms of caching those pages and making sure the site still responds quickly and the database doesn't get hammered. For now, hopefully, the ability to do lookups through the links gets you most of what you need.
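
To illustrate the caching concern above, here is a minimal sketch of a time-limited cache sitting in front of a user-agent lookup. It is not the site's actual implementation: the class name `CachedUserAgentLookup`, the `query_db` callable, and the 300-second lifetime are all assumptions made up for this example.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedUserAgentLookup:
    """Hypothetical wrapper that caches lookup results for `ttl` seconds."""

    def __init__(
        self,
        query_db: Callable[[str], List[str]],  # assumed: user agent -> list of IPs
        ttl: float = 300.0,                    # assumed cache lifetime in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._query_db = query_db
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}
        self.db_hits = 0  # counts how often the database is actually queried

    def lookup(self, user_agent: str) -> List[str]:
        # Normalize so trivially different spellings share one cache entry.
        key = user_agent.strip().lower()
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: serve from cache, skip the database
        self.db_hits += 1
        result = self._query_db(key)
        self._cache[key] = (now, result)
        return result
```

With a fixed set of links, the number of distinct cache keys is bounded by the useragents already listed. With free-form queries, every distinct string a visitor types becomes a new key and a new database query, so a real version would presumably also need a size limit or eviction policy, which this sketch leaves out.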

 Re: Search by User-Agent
Author: M.Nordhoff   (28 Aug 06 2:27pm)
Ooh! Secret links! Cool! That's good enough for what I was trying to do.

Edit: Or not. I clicked on's "Crawler Mozilla/4.0( compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; Alexa Toolbar)" user-agent ( -- it's a very long URL) and I just got "We have not yet identified any harvesters active for the criteria you have selected." Does the user-agent search thing not include robots that aren't yet confirmed harvesters?

Post Edited (28 Aug 06 2:34pm)
 Re: Search by User-Agent
Author: M.Prince   (28 Aug 06 3:24pm)
Right now it only works to find other confirmed harvesters that have the same UserAgent. The fact that none came up means that we haven't found any other confirmed harvesters using that UserAgent. We'll think about how to make it so you can search as-yet unidentified IPs.
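
The behavior described above can be sketched as a filter over visitor records. This is an illustration only, not the site's real schema or code: the `Visitor` record, its `confirmed_harvester` flag, the `find_by_user_agent` function, and the example addresses (taken from the reserved documentation range) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Visitor:
    """Hypothetical record for one IP seen by the site."""
    ip: str
    user_agent: str
    confirmed_harvester: bool


def find_by_user_agent(
    visitors: Iterable[Visitor],
    user_agent: str,
    include_unconfirmed: bool = False,
) -> List[str]:
    """Return the IPs that used `user_agent`.

    By default only confirmed harvesters match, mirroring the current
    behavior described in the post; include_unconfirmed=True models the
    possible future search over not-yet-identified IPs.
    """
    return [
        v.ip
        for v in visitors
        if v.user_agent == user_agent
        and (include_unconfirmed or v.confirmed_harvester)
    ]


# Made-up sample data for the example.
SAMPLE = [
    Visitor("192.0.2.10", "psycheclone", True),
    Visitor("192.0.2.11", "psycheclone", True),
    Visitor("192.0.2.20", "SomeCrawler/1.0", False),
]
```

Under these assumptions, `find_by_user_agent(SAMPLE, "SomeCrawler/1.0")` returns an empty list because the only matching visitor is not a confirmed harvester, which is the situation behind the "We have not yet identified any harvesters" message. Passing `include_unconfirmed=True` would return that visitor's IP.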

