Author: A.Daviel (24 Feb 05 3:59am)
I was looking at robots traversing my website and found that some of the
Java-based ones that ignored robots.txt were coming through a Squid proxy.
By default, Squid adds an X-Forwarded-For header giving the address of the
original requestor, which PHP exposes as $_SERVER['HTTP_X_FORWARDED_FOR'].
It might be worthwhile logging this information in PHP. No doubt there are
others using proxies that don't add headers; I remember trouble a little
while back with elementary schools in Korea running a misconfigured Apache
proxy. Some of the Via headers I saw:
HTTP_VIA: 1.1 proxy.vianet:3129 (squid/2.5.STABLE8)
HTTP_VIA: 1.1 linux.egyptnetwork.com:3128 (squid/2.5.STABLE6)
HTTP_VIA: 1.1 server.sharknet.ro:3128 (squid/2.5.STABLE7)
- this seems to be an open proxy (anyone can use it)
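
A rough sketch of how the logging could be done, in case it helps. The function name proxy_log_line() and the log format are my own invention for illustration, not anything standard. It assumes PHP 7 or later (for the ?? operator). Both headers are supplied by the client or the proxy and can be forged, so this only records them and does not treat them as the real client address.

```php
<?php
// Hypothetical helper: build one log line from a $_SERVER-style array.
// The name and the output format are assumptions made for this sketch.
function proxy_log_line(array $server): string
{
    // Strip control characters so a crafted header cannot inject
    // extra lines into the log.
    $clean = function ($value): string {
        return (string) preg_replace('/[\x00-\x1F\x7F]/', '', (string) $value);
    };

    // REMOTE_ADDR is the address that actually connected (the proxy, if any).
    $remote = $clean($server['REMOTE_ADDR'] ?? '-');
    // X-Forwarded-For and Via are optional and untrusted; '-' means absent.
    $forwarded = $clean($server['HTTP_X_FORWARDED_FOR'] ?? '-');
    $via = $clean($server['HTTP_VIA'] ?? '-');

    return sprintf('remote=%s forwarded_for="%s" via="%s"', $remote, $forwarded, $via);
}

// Log the current request when running under a web server.
if (PHP_SAPI !== 'cli') {
    error_log(proxy_log_line($_SERVER));
}
```

Where the line ends up depends on the error_log setting in php.ini. Requests that come through a proxy that adds neither header will simply show "-" in both fields, so this will not catch the silent ones.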