Message Board

Bugs & Development

 User-Agent Strings Being Cut Off?
Author: B.Booey   (1 Jan 07 1:25pm)
Is it just me, or are some of the user-agent strings on the stats pages being cut off prematurely?

http://www.projecthoneypot.org/robot_useragents.php

If so, could the full strings be shown instead (even if it means wrapping to a new line)? Many webmasters use this information to block bots in their htaccess :o)

TIA.
 
 Re: User-Agent Strings Being Cut Off?
Author: M.Lange   (28 Mar 07 9:21pm)
Looks to me like it's a 55-character-wide field. It might be truncated in the database table.
 
 Re: User-Agent Strings Being Cut Off?
Author: J.Haywood   (24 Apr 07 8:05pm)
Yes, I think that data would be very useful. I already block lots of useragents from spammers and hackers searching for vulnerabilities with htaccess.
 
 Re: User-Agent Strings Being Cut Off?
Author: M.Prince   (25 Apr 07 4:29am)
Yeah, we truncate it in the database. We may revisit this issue in the future to make the size of the field larger, but in order to index it to make lookups fast, we probably have to set some limit on its length.

Suggestions??
 
 Re: User-Agent Strings Being Cut Off?
Author: J.Haywood   (26 Apr 07 5:18am)
I developed an anti-spam plug-in for a popular content management system, and the methodology I use at the moment is:
Data is stored in a MySQL database when bad activity is detected.
I then convert this raw data into an XML feed, which my plug-in converts to an HTML page.
Users can see the HTML page, which, like your own page, limits the length of each string (to save bandwidth), but the raw XML feed is also available to them. My plug-in can also connect to the master XML file held on my site and use it to write IP/hostname or user-agent blocking rules to the user's htaccess file.
I also track suspected hacking activity, e.g. GET and POST strings are analyzed for known vulnerabilities in a variety of software.
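
A minimal sketch of the feed-to-htaccess step described above, not the plug-in's actual code. The feed layout (<agents><agent>...</agent></agents>), the function names and the bad_bot variable are all assumptions made for illustration; SetEnvIfNoCase and "Deny from env=" are standard Apache 2.2 directives. Each pattern is anchored only at the start, so a truncated user-agent string still matches the full one.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout, assumed for this sketch only.
SAMPLE_FEED = """<agents>
  <agent>BadBot/1.0 (+http://example.invalid/bot)</agent>
  <agent>EvilScraper 2.3</agent>
</agents>"""

# Regex metacharacters, plus the double quote that delimits the directive argument.
_SPECIAL = set('.^$*+?()[]{}|\\"')


def escape_for_apache(text):
    """Backslash-escape characters that would change the meaning of the pattern."""
    return "".join("\\" + ch if ch in _SPECIAL else ch for ch in text)


def feed_to_htaccess(xml_text):
    """Turn each <agent> entry into a SetEnvIfNoCase rule plus a deny block."""
    root = ET.fromstring(xml_text)
    lines = []
    for node in root.iter("agent"):
        agent = (node.text or "").strip()
        if not agent:
            continue  # skip empty entries rather than emit a match-everything rule
        # Anchor at the start only: the stored value may be a truncated prefix.
        pattern = "^" + escape_for_apache(agent)
        lines.append('SetEnvIfNoCase User-Agent "%s" bad_bot' % pattern)
    # Apache 2.2-style access control; later versions use "Require" instead.
    lines += ["Order Allow,Deny", "Allow from all", "Deny from env=bad_bot"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(feed_to_htaccess(SAMPLE_FEED))
```

Escaping matters here because real user-agent strings are full of dots, parentheses and plus signs, which would otherwise be read as regex syntax.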
 
 Re: User-Agent Strings Being Cut Off?
Author: R.Laverick   (26 Apr 07 6:37am)
Not sure what DB you're using in the back end, but an index can cover just a prefix of a column. Using your numbers in MySQL form (example inspired by http://www.devshed.com/c/a/MySQL/Optimizing-for-Query-Speed/4/ ):


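CREATE TABLE t
(
name VARCHAR(1024), -- VARCHAR, because CHAR tops out at 255 characters in MySQL
INDEX (name(55)) -- index only the first 55 characters
);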

Or you could even use the TEXT type, which in MySQL holds up to 65,535 bytes, so there is effectively no limit:

CREATE TABLE t
(
name TEXT,
INDEX (name(55))
);

As a third alternative, you could have another table, with everything stored as TEXT, as the primary data repository. You would then use your current truncated table for most lookups and reach the full-data table through a common key.
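
A rough sketch of that third alternative, using Python's built-in sqlite3 module so it runs anywhere. This is not MySQL's prefix-index syntax and says nothing about how the site's real schema looks; the table names, column names and helper functions are invented for illustration. The short table holds an indexed 55-character prefix, and the full string lives in a second table joined on a shared id.

```python
import sqlite3

PREFIX_LEN = 55  # the field width guessed at earlier in the thread


def build_db():
    """Create the two hypothetical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE agent_full (
            id INTEGER PRIMARY KEY,
            agent TEXT NOT NULL          -- complete, untruncated string
        );
        CREATE TABLE agent_short (
            id INTEGER PRIMARY KEY REFERENCES agent_full(id),
            prefix TEXT NOT NULL         -- first PREFIX_LEN characters only
        );
        CREATE INDEX agent_short_prefix ON agent_short(prefix);
    """)
    return conn


def add_agent(conn, agent):
    """Store the full string and its truncated prefix under the same id."""
    cur = conn.execute("INSERT INTO agent_full (agent) VALUES (?)", (agent,))
    conn.execute(
        "INSERT INTO agent_short (id, prefix) VALUES (?, ?)",
        (cur.lastrowid, agent[:PREFIX_LEN]),
    )
    return cur.lastrowid


def find_full(conn, agent):
    """Narrow by the indexed prefix, then confirm against the full string."""
    rows = conn.execute(
        "SELECT f.agent FROM agent_short s "
        "JOIN agent_full f ON f.id = s.id "
        "WHERE s.prefix = ? AND f.agent = ?",
        (agent[:PREFIX_LEN], agent),
    )
    return [row[0] for row in rows]
```

Two user agents that share their first 55 characters get the same prefix, so the second condition on the full string is what tells them apart.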

Post Edited (26 Apr 07 6:45am)
 
 Re: User-Agent Strings Being Cut Off?
Author: M.Prince   (26 Apr 07 10:33am)
I'm beyond my technical pay grade. Anyone want to talk about habeas corpus?

Anyway... I don't know why we made the decision to truncate the length of the useragents where we did. I was told there was an answer by the DB guys who are smart and not bullshitters. Usually the answer is speed and indexing. I'll look into it a bit more.

And, by the way, the backend database is PostgreSQL. I know that much.
 
 Re: User-Agent Strings Being Cut Off?
Author: L.Holloway   (26 Apr 07 2:01pm)
I believe the display truncated more than we stored in the backend! Probably a mistake on our part. Anyways, the user agent pages now display much more and this should be less of a problem. :)
 
 Re: User-Agent Strings Being Cut Off?
Author: B.Booey   (27 Apr 07 4:34am)
Ok, thanks very much, this is much better now :]




Copyright © 2004–24, Unspam Technologies, Inc. All rights reserved.
