Message Board

Newbie/Basic Questions

 Directory Dependent?
Author: C.Pettinger   (29 Dec 04 5:04pm)
I've noticed something interesting recently with a bot pass and I thought I had a simple solution but now I wonder if I broke something. (The names below have been changed to protect the innocent.)

Say I have my honey pot installed at www.example.com/cgi-bin/booger.cgi and I have links all over pointing to that. I have /cgi-bin excluded in my robots.txt, but obviously harvesters (and cyveillance) choose to ignore robots.txt.

However, I just watched a bot go through and grab everything, including other directories prohibited in robots.txt, EXCEPT the cgi-bin directory. For whatever reason they stayed out of there.... So they never hit the pot.

So I figured, fine, I'll set up another cgi directory alias in httpd.conf at www.example.com/getme/ and I put a copy of booger.cgi into the /getme/ directory (actually the aliased directory). I put some links to this around my site.

But I don't think I can do this, can I? Does the script need to be at the registered location -- in my case cgi-bin?
 
 Re: Directory Dependent?
Author: M.Prince   (29 Dec 04 8:47pm)
Wow, that's interesting. Did some of the other pages the robot picked up end with .cgi? It could also be that some robots are ignoring that extension. It could also be that the robot is using Google's information or some old map that doesn't include the honey pot page.

In terms of the directories, you can do a couple of things. You can set up an alias for cgi-bin under a different name and make the links to the honey pot go through that, so long as the original cgi-bin link also works (so we can periodically check with our own spider that it's behaving correctly). So, for example, both of the following URLs will take you to the same honey pot:

http://www.unspam.com/a/complementary.cgi
http://www.unspam.com/fight_spam/complementary.cgi

(PS - I don't encourage people to generally post honey pots on this board, but this one on Unspam has actually been pretty widely reported as an example, so.....)

You should be able to do this by creating a new script alias in your httpd.conf file. Something like:

ScriptAlias /getme/ /usr/local/apache/cgi-bin/
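
To see why both URLs end up at the same script, here is a rough Python sketch of the prefix-to-directory mapping that a script alias performs. It's only a toy model, not how Apache actually implements it: the /cgi-bin/ entry and the booger.cgi name are assumptions borrowed from the example earlier in this thread, and resolve() is a made-up helper.

```python
# Toy model of ScriptAlias-style prefix mapping (not Apache's real logic).
# Both alias entries are assumptions based on the example in this thread.
SCRIPT_ALIASES = [
    ("/cgi-bin/", "/usr/local/apache/cgi-bin/"),
    ("/getme/", "/usr/local/apache/cgi-bin/"),
]


def resolve(url_path, aliases=SCRIPT_ALIASES):
    """Return the filesystem path a URL path maps to, or None if no alias matches."""
    for prefix, directory in aliases:
        if url_path.startswith(prefix):
            # Swap the URL prefix for the directory it is aliased to.
            return directory + url_path[len(prefix):]
    return None


if __name__ == "__main__":
    # Both requests should land on the same file on disk.
    print(resolve("/cgi-bin/booger.cgi"))
    print(resolve("/getme/booger.cgi"))
```

Since both prefixes point at the same directory, the script itself never moves; only the URL that robots see changes.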

If, on the other hand, you want to set up a completely new script directory, it's a bit more complicated. In that case, you'll need to log in to your account and go to the Manage Honey Pots page. Click on the honey pot you're dealing with and then click the "change script location" link below the current SCRIPT LOCATION. You'll have to confirm that you want to move it. Then go into your server and move the script to the new location. The script will be put into an unconfirmed mode. Access it through a web browser in the new location and you'll get a page including an ACTIVATE link. Click that link and the script will be officially moved.

Now, what could go wrong? If you're using the Movable Type Plugin and your MT installation uses cgi-bin as the default directory, then it might be tricky. The MT Plugin assumes the script lives in the same location as the rest of the MT files. That means that in order to move it you might need to move your entire MT installation. That'd be a big pain, and you may not want to do it if robots really are ignoring .cgi files in the cgi-bin directory. Do note that you can still move the honey pot; you'll just need to add the links manually instead of letting the plugin add them for you.

Hopefully that all makes sense. Keep watching your logs and see whether robots continue to ignore the page. If you don't mind, report back here and let us know what happens. Your experience may change what we recommend people do.
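
If it helps with the log watching, here is a small Python sketch that lists visitors who fetched pages but never touched a given directory. It assumes your access log is in Apache's Common or Combined Log Format; visitors_skipping() and the /cgi-bin/ default are made up for illustration and are not part of any Project Honey Pot tool.

```python
import re
from collections import defaultdict

# Pulls the client IP and requested path out of a Common/Combined Log Format line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST|HEAD) (\S+)')


def visitors_skipping(lines, watched_prefix="/cgi-bin/"):
    """Return sorted IPs that requested pages but nothing under watched_prefix."""
    paths_by_ip = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ip, path = match.groups()
            paths_by_ip[ip].add(path)
    return sorted(
        ip
        for ip, paths in paths_by_ip.items()
        if not any(p.startswith(watched_prefix) for p in paths)
    )


if __name__ == "__main__":
    # Hypothetical log path; adjust to wherever your server writes its access log.
    with open("access_log") as log:
        for ip in visitors_skipping(log):
            print(ip)
```

Grouping by IP is a crude stand-in for "one robot", so treat the output as a list of candidates to eyeball, not proof. Passing watched_prefix="/getme/" would run the same check against the aliased directory.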



Copyright © 2004–25, Unspam Technologies, Inc. All rights reserved.