Author: H.Nienhuys (19 Aug 06 8:34am)
I just installed the honeypot software on my site, after noticing that a spambot from 208.66.195.xx had downloaded about 1.5 million nonexistent email addresses from my spider trap.
I noticed that the script emits the following meta tags:
<meta name="robots" content="follow">
<meta name="robots" content="noarchive">
<meta name="robots" content="noindex">
<meta name="no-email-collection" content="/">
Questions:
1. Is this meaningful? I suspect that with multiple meta robots tags only the last one counts. See http://www.robotstxt.org/wc/meta-user.html
2. Does the fourth tag ("no-email-collection") have any formal status? It looks to me like an initiative by PHPot or Unspam.com. The content="/" makes no sense to me either, since there are no ToS at that location.
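To make question 1 concrete, here is a minimal Python sketch of the two readings a crawler could apply to the tags above: "only the last robots tag counts" versus "all robots tags are combined". It is only an illustration. The class and function names are made up for this post, and it does not claim to reproduce what any particular crawler actually does.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the content of every <meta name="robots"> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Only the "robots" tags matter here; "no-email-collection" is skipped.
        if (attr.get("name") or "").lower() == "robots":
            self.values.append(attr.get("content") or "")


def split_directives(content):
    """Split a content value such as "noindex, follow" into lowercase directives."""
    return [d.strip().lower() for d in content.split(",") if d.strip()]


def last_tag_wins(values):
    """Hypothetical reading 1: only the final robots tag is honoured."""
    return split_directives(values[-1]) if values else []


def combine_all(values):
    """Hypothetical reading 2: directives from all robots tags are merged."""
    seen = []
    for value in values:
        for directive in split_directives(value):
            if directive not in seen:
                seen.append(directive)
    return seen


# The four tags quoted in this post.
HEAD = """
<meta name="robots" content="follow">
<meta name="robots" content="noarchive">
<meta name="robots" content="noindex">
<meta name="no-email-collection" content="/">
"""

collector = RobotsMetaCollector()
collector.feed(HEAD)
print(last_tag_wins(collector.values))  # ['noindex']
print(combine_all(collector.values))    # ['follow', 'noarchive', 'noindex']
```

Under the first reading "follow" and "noarchive" are silently dropped, which is exactly my concern. A single tag such as <meta name="robots" content="noindex,follow,noarchive"> would give the same result under both readings.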
P.S. These are the stats from the harvester:
IP             pages  hits   bandwidth  time of last visit
208.66.195.3   19296  19296  363.55 MB  17 Aug 2006 - 05:04
208.66.195.6   13044  13044  242.91 MB  16 Aug 2006 - 10:23
208.66.195.2    5918   5918  109.03 MB  14 Aug 2006 - 01:09
208.66.195.5    5429   5429  102.00 MB  17 Aug 2006 - 00:15
208.66.195.10   4031   4031   74.38 MB  05 Aug 2006 - 04:44
208.66.195.21   2907   2907   52.15 MB  13 Aug 2006 - 02:00
208.66.195.11   1902   1902   35.68 MB  03 Aug 2006 - 22:41
208.66.195.19   1414   1414   25.50 MB  09 Aug 2006 - 20:51
208.66.195.15   1307   1307   24.09 MB  10 Aug 2006 - 19:56
208.66.195.22   1199   1199   21.76 MB  13 Aug 2006 - 18:41
208.66.195.4
Post Edited (22 Aug 06 5:31pm)