Message Board

http:BL Use/Development

Please comment on proactive spam killer

Author: M.Baynton (18 Jan 09 1:21pm)

Hello all,
I'm looking for feedback on an idea for an allied network of forums/blogs that would detect spammers and share real-time data about them as they surface. The idea goes something like this:

The short version is that it would let a particular blog/forum (say, "forum X") make use of Project Honey Pot data on a particular IP even when information on that IP only became available *after* the IP had posted to "forum X." What I propose is a forum spam bulletin network that proactively lets forum & blog software know when a new spammer has surfaced.

Currently, a forum has to query http:BL to see whether an IP has already been identified at the moment that IP registers or posts. Under my proposed system, in addition to querying http:BL for existing known spammers, forums would actually be notified about new spammers as they were discovered. Forum admins could configure their forum software to handle these spam bulletins however they wanted. Potentially, this addresses the greatest weakness of http:BL: spammers have free rein on Project Honey Pot protected forums until someone detects and reports their bad activity, and any damage they do during that time must be cleaned up manually. With spam bulletins, your forum software could also delete messages and ban users that had already gotten into your forum and were only added to the http:BL database afterwards. The spam might live on your forum for a short while, but it would get deleted automatically, with no manual admin intervention required.

Potentially, we could also add communication from forums back to the network, i.e. whenever a post occurred or a user registered on any participating forum, a central service would be notified and could then detect and send bulletins on spamlike trends, such as rapid-fire posting from a particular IP.

Beat THAT, spambots.
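To make the "rapid-fire posting" idea above a bit more concrete, here is a minimal Python sketch of how a central service might flag an IP that posts too often in a short time. The class name, the limit of 5 events, and the 60-second window are all hypothetical choices for illustration, not part of any existing Project Honey Pot service. It assumes events arrive in time order.

```python
from collections import defaultdict, deque


class RapidFireDetector:
    """Flags an IP that produces more than max_events posts or
    registrations within window_seconds (both values are assumptions)."""

    def __init__(self, max_events=5, window_seconds=60.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # ip -> recent event timestamps

    def record(self, ip, timestamp):
        """Record one event and return True if the IP now exceeds the limit.

        Timestamps are seconds and are assumed to be non-decreasing per IP.
        """
        events = self._events[ip]
        events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_events
```

A participating forum would report each post or registration, and the service would send a bulletin the first time `record` returns True for an IP.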

Of course, this does also empower clever outsiders to delete content from forums to a certain degree: if a single individual created enough bogus reporting accounts and filed enough reports of bad activity by a certain IP, that IP's posts could get deleted from subscribing forums. However, I don't think this is a real problem, because there would be nothing to be gained by doing this, most netizens are nice (see Wikipedia), it would be a lot of work to pull off, and even if somebody did it, it wouldn't do a whole lot of damage if targeted at legitimate users, since their posts would likely only appear on one forum in the entire network. The risk would be further reduced by only deleting/banning posts and users created in the last week or so, to prevent ancient, legitimate posts from randomly getting deleted if the IP address used to post them was later taken over by a bot. So I think it would clearly do a lot more good than harm.
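The "last week or so" safeguard could be sketched like this in Python. The function name, the post dictionary layout (`ip`, `created`), and the 7-day default are assumptions made up for the example; real forum software would query its own database instead.

```python
from datetime import datetime, timedelta


def posts_to_remove(posts, flagged_ip, bulletin_time, max_age=timedelta(days=7)):
    """Return the posts from flagged_ip that are recent enough to act on.

    posts is a list of dicts with hypothetical keys "ip" and "created".
    Anything older than max_age is left alone, since the IP may have
    belonged to a legitimate user back then.
    """
    cutoff = bulletin_time - max_age
    return [
        post
        for post in posts
        if post["ip"] == flagged_ip and cutoff <= post["created"] <= bulletin_time
    ]
```

Older posts from the same IP are simply skipped, so a bulletin can never reach back further than the configured window.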

Technically, a high-level overview is:
I can develop the software for the forum spam bulletin network (FSBN) and, if necessary, maintain some of the servers involved. Forum admins could opt in to FSBN, providing information including how many times they wanted an IP to be flagged as spamming before the FSBN should notify their forum software.

Whenever a report was received at Project Honey Pot, it would forward the report, along with Project Honey Pot's aggregate spamming-frequency data, to an FSBN "New Report" web service (a simple HTTP request would do this just fine). Whenever the FSBN received a new report, it would forward it to a handful of bulletin distribution nodes: a distributed network of computers, each responsible for notifying a subset of all opted-in forums/blogs.

Each bulletin distribution node would have a local database of the forums it was responsible for, along with the notification threshold specified by each forum's admin. Each time a new report arrived at a bulletin distribution node, the node would compare the spamming frequency for the IP in the report to the threshold of each forum, and send the spam bulletin to the appropriate forums via an HTTP connection. This system should be able to notify many thousands of forums of a new spammer in only a few minutes, or much less with a larger number of bulletin distribution nodes.
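As a rough illustration of the distribution-node logic described above, here is a small Python sketch. Everything named here (`Subscription`, `node_for_forum`, `forums_to_notify`, the callback URLs) is hypothetical, and hashing the forum URL is just one possible way to split forums across nodes; the post itself does not say how forums would be assigned.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    callback_url: str  # where the forum wants bulletins sent (assumed field)
    threshold: int     # reports required before this forum is notified


def node_for_forum(callback_url, node_count):
    """Pick a distribution node for a forum with a stable hash (an assumption)."""
    digest = hashlib.sha256(callback_url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def forums_to_notify(subscriptions, report_count):
    """Return callback URLs whose threshold the IP's report count has reached."""
    return [s.callback_url for s in subscriptions if report_count >= s.threshold]
```

Each node would run `forums_to_notify` over its own local subset and then make one HTTP request per returned URL, so adding nodes shortens the total notification time.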

Forum-side, mods that handle the incoming bulletins could be written for popular forum and blog software. This should be relatively easy to implement for web publication platforms, since no cron jobs or anything would need to be set up on the forum sites themselves - just an added CGI script.
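The first thing such a forum-side script would need to do is read and sanity-check the incoming bulletin. Below is a minimal Python sketch of that step only. The JSON format and the field names `ip` and `report_count` are assumptions invented for this example; no bulletin format has been defined yet.

```python
import ipaddress
import json


def parse_bulletin(body):
    """Return (ip, report_count) from a JSON bulletin body, or None if malformed.

    The "ip" and "report_count" fields are a hypothetical bulletin format.
    """
    try:
        data = json.loads(body)
        ip = str(ipaddress.ip_address(data["ip"]))  # rejects invalid addresses
        count = int(data["report_count"])
    except (ValueError, KeyError, TypeError):
        return None
    if count < 0:
        return None
    return ip, count
```

A real mod would also want some way to confirm the bulletin really came from the FSBN before deleting anything, and would then hand the parsed IP to the forum's own delete/ban routines.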

Please comment. Thanks.