Message Board

Tracking Harvesters/Spammers

 MJ12bot
Author: A.Chudnovsky   (2 Jun 07 12:29pm)
Hi,

I am a founder of a project that is attempting to build a WWW-scale search engine that can rival the best of the best. Naturally, crawling is one of the key activities that we have to undertake, and this is done using a distributed model much like distributed.net and SETI@home. More details about our project are here: http://www.majestic12.co.uk/

It has been brought to my attention by some of our users that some of their legit IPs were classified as spammers, which they are most certainly not - it is possible that some bad folk use fake user-agents, so I hope you have some kind of mechanism not to be trigger happy and whack the good guys like ourselves.

regards

Alex
 
 Re: MJ12bot
Author: A.Z4   (6 Jun 07 6:25pm)
Alex,



Your bot (MJ12bot) gets into spider traps. It does not follow robots.txt all the time.

Just picture this: there is a page with a pixel link on it. A human can't see it, and you are not Google (yet). So if your bot follows the pixel link, it gets banned and reported.

It is possible that some bad folk use fake user-agents; they do it all the time. Unfortunately, if someone's bot comes to a site these days claiming to be nice while coming from dynamic IP ranges, it would certainly get banned on sites like mine in a heartbeat.


The user agent could say it is the Slurp bot, but unless there is a way to verify it, it's not Slurp.
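One common way to verify such a claim is forward-confirmed reverse DNS: look up the host name for the connecting IP, check that it belongs to the crawler operator's domain, then resolve that host name again and confirm it maps back to the same IP. The following is a minimal sketch of that idea, not anything this site is known to run. The function name `verify_crawler`, its parameters, and the `crawl.yahoo.net` suffix are assumptions made for illustration; check the operator's own documentation for the real host names.

```python
import socket


def verify_crawler(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a forward-confirmed reverse DNS check.

    `allowed_suffixes` lists the domains the crawler operator is assumed to
    use. The two lookup callables are hypothetical hooks so the check can be
    exercised without network access; by default they use the socket module.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse lookup (IP -> host name).
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False

    # Step 2: the host name must sit inside one of the allowed domains.
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False

    # Step 3: forward lookup (host name -> IPs) must include the original IP.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False


# Example call (performs real DNS queries, so the result depends on the network):
# verify_crawler("192.0.2.10", ["crawl.yahoo.net"])
```

A check like this only works for crawlers whose operator controls the reverse DNS of its addresses, which is not the case for a crawler running from volunteers' home connections.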

UAs


MJ12bot/v1.0.4 (http://majestic12.co.uk/bot.php?+)
MJ12bot/v1.0.7 (http://majestic12.co.uk/bot.php?+)
MJ12bot/v1.0.8 (http://majestic12.co.uk/bot.php?+)
MJ12bot/v1.2.0 (http://majestic12.co.uk/bot.php?+)


IPs

129.241.111.168
193.64.31.23
205.209.170.162
205.209.170.172
205.209.170.177
205.209.170.201
205.209.183.161
212.191.65.241
213.115.58.34
213.216.247.249
213.84.192.184
216.105.213.176
66.108.41.54
67.22.9.58
67.68.226.158
67.71.157.176
68.185.24.2
68.48.242.29
69.159.10.232
69.159.37.151
69.243.48.152
71.168.107.138
74.135.126.189
81.167.8.147
82.52.23.150
82.99.36.100
84.184.147.193
84.248.180.84
84.48.34.91


You might also want to program it to stop crawling a site once it gets a 403.
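As a rough illustration of that suggestion, a crawl loop can remember every host that has answered 403 and skip all further URLs on it. This is only a sketch under stated assumptions: `crawl` and `fetch` are hypothetical names, and `fetch` stands in for whatever HTTP client the crawler really uses, returning just a status code.

```python
from urllib.parse import urlsplit


def crawl(urls, fetch):
    """Fetch URLs in order, skipping any host that has already returned 403.

    `fetch` is a hypothetical callable: it takes a URL and returns the HTTP
    status code of the response.
    """
    blocked_hosts = set()
    fetched = []
    for url in urls:
        host = urlsplit(url).netloc.lower()
        if host in blocked_hosts:
            # The site already said "forbidden"; do not ask again.
            continue
        status = fetch(url)
        if status == 403:
            blocked_hosts.add(host)
            continue
        fetched.append(url)
    return fetched, blocked_hosts
```

In a distributed crawler the set of blocked hosts would have to be shared between all crawling nodes, otherwise each node would rediscover the 403 on its own.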


Check whois on:
205.209.183.161 (Managed Solutions Group, Inc.) Hosting
205.209.170.201 (Managed Solutions Group, Inc.) Hosting
68.185.24.2 (Charter Communications MDFRD-OR-68-185-0) Dynamic


what do you expect?
 
 Re: MJ12bot
Author: A.Chudnovsky   (31 Oct 08 2:38pm)
Hi,

We obey robots.txt, and we also do so when we get a 403 error on the robots.txt request itself. Our current user-agent is
MJ12bot/v1.2.3 (http://majestic12.co.uk/bot.php?+). Anything below 1.2.1 is fake, and we are not responsible for its behavior.
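A site owner who wants to act on that version cutoff could parse the version out of the user-agent string and compare it with 1.2.1. The sketch below assumes the `MJ12bot/vX.Y.Z` format seen in this thread; the names `mj12bot_version`, `below_claimed_minimum` and `MIN_GENUINE` are made up for illustration. Since a user-agent string can be spoofed, passing this check does not prove a request is genuine; it only flags versions the project itself disowns.

```python
import re

# Format assumed from the user-agent strings quoted in this thread.
UA_PATTERN = re.compile(r"^MJ12bot/v(\d+)\.(\d+)\.(\d+)\b")

# Cutoff stated in the post above: anything below 1.2.1 is said to be fake.
MIN_GENUINE = (1, 2, 1)


def mj12bot_version(user_agent):
    """Return the version as a tuple of ints, or None if it is not an MJ12bot UA."""
    match = UA_PATTERN.match(user_agent)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def below_claimed_minimum(user_agent):
    """True if the UA claims to be MJ12bot with a version below the cutoff."""
    version = mj12bot_version(user_agent)
    return version is not None and version < MIN_GENUINE
```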

We do not have a single IP range because of the distributed nature of our crawling. We do, however, obey robots.txt, including Crawl-Delay, so I feel that claims that we are a bad bot are not right and bring this project into disrepute, because you are banning a bot that obeys robots.txt on the basis that we don't have a fixed IP range.
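For readers unfamiliar with what obeying robots.txt and Crawl-Delay involves, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are invented for illustration, and this is not a description of how MJ12bot itself is implemented. Crawl-delay is a widely used extension rather than part of the original robots.txt convention, so support varies between crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
ROBOTS_TXT = """\
User-agent: MJ12bot
Crawl-delay: 5
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks before each request and waits between requests.
allowed_private = parser.can_fetch("MJ12bot", "http://example.com/private/page.html")
allowed_index = parser.can_fetch("MJ12bot", "http://example.com/index.html")
delay_seconds = parser.crawl_delay("MJ12bot")

print(allowed_private, allowed_index, delay_seconds)
```

With this file, the crawler identified as MJ12bot may fetch the index page but not anything under /private/, and it is asked to wait 5 seconds between requests.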

> what do you expect?

Our reasonable expectation is that people should take all the facts into account and avoid a knee-jerk reaction on the basis of some fake bots that have nothing to do with us. We are acting honestly by showing our user-agent all the time and obeying robots.txt. Our crawl is measured in many billions of pages; it is not right to pick one incident (especially if it comes from fake bots) and then blame the whole thing.

Anyone can check our bot's page; there is an email contact address there that we always read. We obey robots.txt pretty well, so it would be nice if, after all this effort, we were not lumped in with spammers and other evil bots that fake user-agents and don't even bother trying to support robots.txt.

regards,

Alex
 
 Re: MJ12bot
Author: J.Editor   (31 Dec 15 1:59am)
MJ12 is a thorn in my side. If, as stated, mj12bot obeys robots.txt and never returns when given a 403, why does this hated bot keep returning? When I told the aforementioned A.Chudnovsky, he/she claimed not to know the URL of the website, and even after being issued with a cease and desist order the bot has been returning on a daily basis from the following IP address: 195.191.106.73, host aston73.majestic12.co.uk

I, for one, do not find the claims that Chudnovsky makes on all kinds of fora defending the actions of mj12bot credible. I have not given Chudnovsky or anyone else at their organisation consent to use my site for them to generate income or whatever from my intellectual property and hard work.

I am finding the behaviour of Chudnovsky particularly harmful to my health and well being and am wondering if a class action lawsuit against him/her and majestic might get rid of this truly hated and disruptive organisation once and for all.

Hit Chudnovsky with a cease and desist order stating that Chudnovsky will face a penalty charge for each visit, and file legal proceedings to recover the debt. That will generate some publicity that will prove damaging for MJ12 and Chudnovsky!
 
 Re: MJ12bot
Author: J.Editor   (25 Aug 16 8:33am)
No-one wants MJ12 bots near their sites and I, like others, am sick of the utter garbage spouted in defence of MJ12. Even if MJ12 were a legitimate operation, with nothing to show publicly after all these years it can hardly be seen as beneficial to anyone except the pondlife running it. I contacted the scumbag who runs it and received less than friendly responses, and he carried on harassing me. Go to hell.

Vexing website owners, stealing bandwidth and using dubious web hosts to keep attacking sites.




Copyright © 2004–17, Unspam Technologies, Inc. All rights reserved.
