 RSS Feeds for Stats
Author: Y.Shafranovich   (19 Jan 05 1:54am)
I would like to suggest adding RSS feeds for the honeypot statistics.
 
 Re: RSS Feeds for Stats
Author: M.Prince   (19 Jan 05 2:10am)
Do you want the RSS feed to publish the list of the Top 25 Harvesters on another site? Or do you want it in order to create a "banned list" tool?

We hadn't thought of people wanting to do the former. If there's interest, we can definitely look into creating something like that.

In terms of the latter, we're working on it. We will probably not use RSS exactly to distribute the list, but the idea is to create a list of banned IPs that website admins can check against when a visitor comes to their sites. If the visitor is a known harvester then the admin could choose to block them, require them to go through a gateway page (e.g., a CAPTCHA or maybe even just a JavaScript redirect that spiders would be unlikely to parse), or maybe automatically strip any email addresses or other functions (e.g., POST) from the pages served through the web server.
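To make the idea concrete, here is a minimal sketch of that check in Python. It is not the http:BL design; the `Action` names, the `decide` function, and the in-memory set of banned IPs are all assumptions made up for illustration, and the addresses come from the reserved documentation ranges.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical responses an admin might choose for a known harvester."""
    ALLOW = "allow"
    BLOCK = "block"
    GATEWAY = "gateway"        # e.g. a CAPTCHA or JavaScript redirect page
    STRIP_EMAILS = "strip"     # serve the page with email addresses removed


def decide(visitor_ip: str, banned_ips: set[str],
           policy: Action = Action.GATEWAY) -> Action:
    """Return the admin's chosen policy if the visitor is on the banned list,
    and ALLOW otherwise."""
    return policy if visitor_ip in banned_ips else Action.ALLOW


# Example: one known harvester, with the admin's policy set to a gateway page.
banned = {"192.0.2.10"}
print(decide("192.0.2.10", banned))     # Action.GATEWAY
print(decide("198.51.100.7", banned))   # Action.ALLOW
```

A real module would look the address up in a shared list rather than a local set, but the decision step would have roughly this shape.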

If we can ensure it doesn't consume too many resources (i.e., get too expensive), our desire is to make such a tool available for free to anyone who installs a honey pot on their site, donates some MX records, refers a bunch of users to us, or is otherwise an active member of the Project Honey Pot community.

Our working name is the http:BL. We've begun the coding of an Apache module. We'd like to make one for IIS and some other webserver platforms as well. If anyone has any interest in helping with this part of the Project (especially if you have expertise in Apache modules or IIS plugins) please contact us and we'll get you involved!!

I know that's more of an answer than you were probably looking for, but if this is why you're looking for an RSS feed then help is on the way. If you (or anyone else) are looking for it to publish data on your server then I'll look into whether we can set something up.
 
 Re: RSS Feeds for Stats
Author: B.Pennypacker   (19 Jan 05 3:53pm)
It'd also be nice to have a DNSBL of IPs that spam was detected originating from. That'd make it easy to add rules to SpamAssassin and other spam filtering tools.
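For anyone unfamiliar with how a DNSBL is queried, here is a rough Python sketch. By convention the IPv4 octets are reversed and prepended to the list's zone, and an A record in the answer means the address is listed. The zone `dnsbl.example.org` is a placeholder, not a real Project Honey Pot zone, and both function names are made up for this example.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets + the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Treat any A-record answer as 'listed'. A failed lookup (including
    NXDOMAIN) is treated as 'not listed' in this sketch."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# The zone below is a hypothetical placeholder.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# 99.2.0.192.dnsbl.example.org
```

Spam filters do the same lookup internally, which is why publishing data as a DNSBL zone makes it easy to plug into existing tools.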

-Bruce
 
 Re: RSS Feeds for Stats
Author: M.Prince   (19 Jan 05 4:19pm)
Yeah, we're thinking about that. We have agreed to begin feeding domains to the SURBL that we see in our messages, so that should be of some help. (That should be up and running as soon as we have time to come up for air and debug a couple issues we discovered in our link parser.)

So far we've made it challenging to get the list of spam server IPs. There's a rationale for this: we want to avoid abuse. I worry that if a spammer installs a honey pot and gathers a bunch of our addresses they'll then turn around and give them to e-newsletters and other legitimate sites. Even if they're doing a confirmed opt-in, the opt-in messages (especially if multiple email addresses are submitted) could trigger a response.

There may be ways to deal with this. For example, maybe we wait until a single server has sent at least two messages to the same address (sort of dealing with the newsletter problem). Maybe we just hope that with enough data things work themselves out and abuse becomes statistically insignificant to the point that we can just lop off the bottom of the data and be safe. I don't know, but I want to think carefully before we start feeding our data irresponsibly. Spam is bad, but it's even worse when rogue users can use anti-spam tools to cause real havoc for legitimate users.
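As a sketch of the first idea only (waiting for at least two messages from one server to the same address), something like the following Python would do the counting. The function name, the `(server_ip, recipient)` tuple format, and the sample data are assumptions for illustration, not a description of how the project stores its data.

```python
from collections import Counter
from typing import Iterable


def servers_over_threshold(deliveries: Iterable[tuple[str, str]],
                           min_messages: int = 2) -> set[str]:
    """Return servers that sent at least `min_messages` messages to the
    same honey pot address. `deliveries` holds (server_ip, recipient) pairs."""
    counts = Counter(deliveries)  # counts each (server, recipient) pair
    return {server for (server, _), n in counts.items() if n >= min_messages}


deliveries = [
    ("192.0.2.1", "a@example.com"),
    ("192.0.2.1", "a@example.com"),     # second message to the same address
    ("198.51.100.2", "a@example.com"),  # one message each to two addresses,
    ("198.51.100.2", "b@example.com"),  # e.g. single opt-in confirmations
]
print(servers_over_threshold(deliveries))  # {'192.0.2.1'}
```

A server that sends a single confirmation to each submitted address never reaches the threshold, which is the newsletter case this rule is meant to spare.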

Two things to note, on the other hand. First, the problem of abuse is not nearly as pervasive when identifying harvesters. Imagine again the e-newsletter scenario. If the rogue user starts picking up our addresses and handing them out to legitimate e-newsletters then the rogue user has effectively become a harvester. Even though the messages are coming through as legit, we're still able to nail the initial abuser for "harvesting."

Second, the SURBL is attractive and worth checking out if you don't already know about it because of the efforts they take to avoid false positives. By maintaining an extensive and growing list of "white" domains, the risk of abuse causing long-term problems is greatly reduced. More information on the SURBL is available here:

http://www.surbl.org/

It is built into the latest version of SpamAssassin and, from reading user reports, is on its own surprisingly effective at dealing with spam. Combined with other SA measures, it's about as good a filtering solution as any.
 
 Re: RSS Feeds for Stats
Author: C.Byrum   (20 Jan 05 1:32pm)
Your wisdom is definitely welcome, Mr. Prince. I would say that most of the data you've collected should remain closely held. As the goal of the project is to provide legal means to combat spammers, it doesn't make much sense to defame them publicly before trial. That sort of thing gets your case thrown out (and no, I am not a lawyer).

I think what Mr. Shafranovich wanted was just an RSS feed with the aggregate statistics. This would be cool for us users of Firefox, as we could quickly look at our live bookmarks and see if any spammers had been caught. This would also be nice to have on our websites where we could say "hey go to project honeypot, here are their stats.."
 
 Re: RSS Feeds for Stats
Author: N.Jackson   (23 Jan 05 3:02pm)
I would like RSS feeds for my own stats, so for example I could say "I have helped catch 6 email harvesters"
 
 Re: RSS Feeds for Stats
Author: B.Janssen   (25 Jan 05 5:48pm)
Hi there!

For this issue, I would like to point the project maintainers to this page:
http://www.xmlhub.com/rssgenr8.php
(source code: http://www.xmlhub.com/rssgenr8.zip)

"RSSgenr8 is a hosted HTML to RSS Scraper Tool which dynamically generates a RSS feed from a HTML web page. Changes to the web page are then automatically reflected in the RSS feed."

All you (project maintainers) would have to do for this feature request is to:

1: Put <span class="rss:item"> ... </span> around each item you want to generate a feed item for (in the HTML of the stats pages, t_***).

2: Host rssgenr8.php on the project's site.

Then the 'personal stats' pages:
http://www.projecthoneypot.org/t_****

(where *** is your uid nr)

Would be available as a real-time RSS feed in every RSS aggregator through the URL:
http://www.projecthoneypot.org/rssgenr8.php?pageurl=http://www.projecthoneypot.org/t_****
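To show roughly what such a scraper does, here is a small Python sketch of the idea. It is not the RSSgenr8 source (which is PHP); the class and function names are made up, and it only handles the simple case of text inside spans marked with class="rss:item".

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ItemSpanParser(HTMLParser):
    """Collect the text inside every <span class="rss:item"> ... </span>."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._depth = 0   # > 0 while inside a marked span (tracks nested spans)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            self._depth += 1
        elif dict(attrs).get("class") == "rss:item":
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.items.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def html_to_rss(html: str, title: str, link: str) -> str:
    """Turn the marked spans of an HTML page into a minimal RSS 2.0 document."""
    parser = ItemSpanParser()
    parser.feed(html)
    parser.close()
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Items scraped from {link}"
    for text in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = text
    return ET.tostring(rss, encoding="unicode")


# Hypothetical stats page fragment and URL, for illustration only.
page = '<p>My stats</p><span class="rss:item">Harvesters caught: <b>6</b></span>'
print(html_to_rss(page, "My stats", "http://example.com/stats"))
```

Because the feed is regenerated from the page on each request, any change to the stats page shows up in the feed without extra work on the page's side.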

Somebody (with more PHP skills than me) could even write an in-between parser which puts the right <span> tags in the right places, if the project maintainers don't have the time to do it. I would be happy to host the whole RSS thing on my servers. (I work at a Dutch ISP.)

Just my 2 cents on supporting RSS user stats ;-)
 
 Re: RSS Feeds for Stats
Author: M.Prince   (25 Jan 05 7:35pm)
Hmmm.... not sure how much I want to encourage too much scraping of our site, but if you're really eager. :-)

We'll definitely work on RSS support. We're trying to get support for some more scripting languages online (mod_perl, Python now online; ASP, ASP.NET coming soon). After that we'll work on RSS and some other cool features.

Glad people are excited to share their stats!
 
 Re: RSS Feeds for Stats
Author: Y.Shafranovich   (5 Feb 05 11:40pm)
I wanted an RSS feed of my own stats
 
 Re: RSS Feeds for Stats
Author: C.Kruslicky   (16 Feb 05 8:44pm)
Aside: http://www.surbl.org doing whitelisting probably explains the weird links I have been seeing in recent spam, links that appear to be generated based on the email domain and such. I had assumed it was just for Bayesian filters.
 
 Re: RSS Feeds for Stats
Author: L.Veltkamp   (22 Aug 06 7:49pm)
I know this thread is kind of old, but I've thought that RSS feeds for my personal stats would be nice. I want to include some general information about the project on my site and being able to directly incorporate those stats would be useful.
 
 Re: RSS Feeds for Stats
Author: M.Prince   (25 Aug 06 11:16pm)
RSS feeds are now online. See the other thread for a full description.




Copyright © 2004–24, Unspam Technologies, Inc. All rights reserved.
