Message Board

Bugs & Development

 HTTP proxies
Author: A.Daviel   (24 Feb 05 3:59am)
I was looking at robots traversing my website and found that some of the Java ones that ignored robots.txt were coming through a Squid proxy. By default, Squid adds an X-Forwarded-For header giving the address of the original requestor. It might be worthwhile logging this information in PHP. No doubt there are others using proxies that don't add headers; I remember trouble a little while back with elementary schools in Korea running a misconfigured Apache proxy.

e.g.
HTTP_USER_AGENT: Java/1.4.1_02
REMOTE_ADDR: 213.252.239.3
HTTP_VIA: 1.1 proxy.vianet:3129 (squid/2.5.STABLE8)
HTTP_X_FORWARDED_FOR: 10.0.5.22

HTTP_USER_AGENT: Java/1.4.1_04
REMOTE_ADDR: 62.241.130.139
HTTP_VIA: 1.1 linux.egyptnetwork.com:3128 (squid/2.5.STABLE6)
HTTP_X_FORWARDED_FOR: 163.121.176.116

HTTP_USER_AGENT: Java/1.4.2_06
REMOTE_ADDR: 81.181.121.4
HTTP_VIA: 1.1 server.sharknet.ro:3128 (squid/2.5.STABLE7)
HTTP_X_FORWARDED_FOR: 81.180.144.72
- this seems to be an open proxy (anyone can use it)
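
For anyone who wants to capture the same fields, here is a minimal sketch in Python rather than PHP. It assumes a CGI/WSGI-style `environ` dictionary keyed the same way as the examples above; the names `PROXY_KEYS` and `proxy_log_line` are made up for illustration and are not part of the Project Honey Pot scripts. Values are written with `repr()` so that control characters in a forged header cannot break up the log line.

```python
# Fields worth recording when a request may have passed through a proxy.
# The key names follow the CGI convention used in the examples above.
PROXY_KEYS = (
    "HTTP_USER_AGENT",
    "REMOTE_ADDR",
    "HTTP_VIA",
    "HTTP_X_FORWARDED_FOR",
)


def proxy_log_line(environ):
    """Return a one-line summary of the proxy-related fields in environ.

    Missing or empty fields are skipped. Values are repr()-quoted so that
    newlines or other control characters sent by a client stay on one line.
    """
    parts = []
    for key in PROXY_KEYS:
        value = environ.get(key)
        if value:
            parts.append(f"{key}={value!r}")
    return " ".join(parts)


if __name__ == "__main__":
    sample = {
        "HTTP_USER_AGENT": "Java/1.4.1_02",
        "REMOTE_ADDR": "213.252.239.3",
        "HTTP_VIA": "1.1 proxy.vianet:3129 (squid/2.5.STABLE8)",
        "HTTP_X_FORWARDED_FOR": "10.0.5.22",
    }
    print(proxy_log_line(sample))
```

Nothing here decides which address is the real one; it only records what the request claimed.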
 
 Re: HTTP proxies
Author: M.Prince   (24 Feb 05 4:36am)
Interesting. Thanks for the tip. The trouble with X-Forwarded-For is that it can be forged as often as it is legitimate. With v0.2 of the scripts we plan on recording X-Forwarded-For as well as tracking whether it or REMOTE_ADDR appears to be reporting the correct IP for a visitor. This should both allow us to function behind reverse proxies (where we currently have trouble) and solve the problem you point out.

May be good to also record http_via for the same reason.

Thanks for the info!!

Matthew.
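
One way to decide which address to believe can be sketched as follows. This is not the planned v0.2 logic, only an illustration under stated assumptions: the site operator maintains a set of proxy addresses it trusts (`trusted_proxies`, a hypothetical parameter), X-Forwarded-For is ignored unless the connection itself came from one of them, and a forwarded address that is private or loopback (like 10.0.5.22 in the first example) is treated as useless for identifying a visitor. The function name `client_ip` is also made up.

```python
import ipaddress


def client_ip(environ, trusted_proxies):
    """Return (ip, used_forwarded) for a CGI/WSGI-style environ dict.

    trusted_proxies is a set of address strings the operator trusts to
    report X-Forwarded-For honestly (an assumption of this sketch).
    """
    remote = environ.get("REMOTE_ADDR", "")
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")

    # A header from an unknown peer may be forged, so fall back to the
    # address of the connection itself.
    if remote not in trusted_proxies or not forwarded:
        return remote, False

    # The header may hold a comma-separated chain; the last entry is the
    # one appended by the proxy that connected to us.
    candidate = forwarded.split(",")[-1].strip()
    try:
        addr = ipaddress.ip_address(candidate)
    except ValueError:
        return remote, False

    # Private or loopback addresses say nothing useful about the visitor.
    if addr.is_private or addr.is_loopback:
        return remote, False

    return str(addr), True


if __name__ == "__main__":
    env = {
        "REMOTE_ADDR": "62.241.130.139",
        "HTTP_X_FORWARDED_FOR": "163.121.176.116",
    }
    print(client_ip(env, {"62.241.130.139"}))
    print(client_ip(env, set()))
```

Logging both the chosen address and the `used_forwarded` flag keeps the raw REMOTE_ADDR available if the trust list later turns out to be wrong.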




Copyright © 2004–24, Unspam Technologies, Inc. All rights reserved.
