1 Billion Spammers Served
Our 1 Billionth Spam Message
PUBLISHED: DECEMBER 15, 2009
On Wednesday, December 9, 2009 at 06:20 (GMT) Project Honey Pot received its billionth email spam message. The message, a picture of which is displayed below, was a United States Internal Revenue Service (IRS) phishing scam. The spam email was sent by a bot running on a compromised machine in India (22.214.171.124). The spamtrap address to which the message was sent was originally harvested on November 4, 2007 by a particularly nasty harvester (126.96.36.199) that is responsible for 53,022,293 other spam messages that have been received by Project Honey Pot.
Every time Project Honey Pot receives a message we estimate that another 125,000 are sent to real victims. Our billionth message represents approximately 125 trillion spam messages that have been sent since Project Honey Pot started in 2004.
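The extrapolation above is a single multiplication. The short Python sketch below reproduces it using only the two figures quoted in this article; the function and constant names are illustrative and are not part of any Project Honey Pot tooling.

```python
# Figures quoted in the article.
MESSAGES_RECEIVED = 1_000_000_000  # spam messages received by the project
MULTIPLIER = 125_000               # estimated messages sent to real victims per message received


def estimate_internet_wide(received: int, multiplier: int = MULTIPLIER) -> int:
    """Scale a count of trapped messages up to an Internet-wide estimate."""
    return received * multiplier


total = estimate_internet_wide(MESSAGES_RECEIVED)
print(f"{total:,} messages (about {total / 1e12:.0f} trillion)")
```

The same function can be applied to any subset of trapped messages, provided the 125,000 multiplier is assumed to hold for that subset.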
At this milestone, we wanted to take a second to report some of our findings. Our goal is not to rehash the same old insights but instead to give a new picture that only looking at five years and a billion data points can produce.
Who Are These Spammers?
Several organizations publish regular reports on the source countries for spam. We have one of our own. The problem is that these reports tell very little about the actual source of spam messages because of the nature of how spam is sent today.
Rather than sending spam directly, spammers primarily use "bot" machines in order to effectively launder their identities. These bots are PCs that have been compromised by a virus and whose owner usually does not know they are being used to send spam. The process is not unlike the stereotypical scene in a movie where the villain keeps his phone call from being traced by relaying it through a number of connections. Similarly, spammers' use of bots can make their messages look like they are coming from somewhere completely different than their actual location. As a result, lists of spam origin countries tell you very little about where the spammers are actually located.
On the other hand, they can help provide insight into a country's security policies because they give evidence on the number of bots operating within a country's borders. Since every country will have a different number of PCs, to make this number comparable we needed to create a ratio. We decided to look at the number of compromised machines operating within the country divided by the number of security professionals operating in the country. This gives us a relative IT security score. As a proxy for the number of security professionals we used members in Project Honey Pot. Here are the results:
Because sending spam remains the primary use of bots, Project Honey Pot has a unique perspective on bot network activity. Since 2004, active bots have grown at a compound annual growth rate of more than 378%. In other words, the number of bots has nearly quadrupled every year. In 2009, you could find nearly 400,000 active bots engaged in malicious activity on any given day, with several million active over the course of any month.
Fortunately, Project Honey Pot's coverage of active botnets has grown over time at an even faster rate. In 2006, we saw fewer than 20% of the active bots on any given day. Today we see more than 80%.
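To get a feel for what compounding at the quoted rate means, here is a minimal Python sketch. It reads the quoted growth figure as a year-over-year multiplier of roughly 3.78, an assumption chosen to match "nearly quadrupled," and it uses a normalized starting count of 1 rather than real bot counts. The helper names `cagr` and `project` are hypothetical.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate as a fraction (1.0 means +100% per year)."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


def project(start: float, yearly_multiplier: float, years: int) -> float:
    """Value after compounding `yearly_multiplier` once per year."""
    return start * yearly_multiplier ** years


YEARLY_MULTIPLIER = 3.78  # assumption: "nearly quadrupled every year"
YEARS = 5                 # 2004 through 2009

# Total growth relative to a normalized starting population of 1.
growth_factor = project(1, YEARLY_MULTIPLIER, YEARS)
print(f"Growth over {YEARS} years: about {growth_factor:,.0f}x")

# Going the other way: recover the yearly multiplier from the endpoints.
recovered = cagr(1, growth_factor, YEARS)
print(f"Implied year-over-year multiplier: {1 + recovered:.2f}x")
```

Note that `cagr` returns the growth above 100%, so a population that exactly quadruples each year has a rate of 3.0 (300%) under this definition.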
While tracking bots has become a critical aspect of Project Honey Pot, we remain curious about where the spammers are actually located. To get at this information it's critical to look at spammer activities that are not laundered through bots. While sending email spam can easily be done in parallel (i.e., 1 million machines can send one message each), harvesting email addresses, which involves crawling web pages, cannot. This makes sense: crawling without centralized command and control will result in a lot of crawling of the same popular pages over and over again.
Our research indicates that, unlike the bots used to send spam, the machines used for harvesting tend to be more permanent, stable, and closely connected to the actual spammer's location. So where are the spammers actually located? We think the list below gives the most accurate approximation.
|Where Harvesters Are|
|#4||United Arab Emirates|
How Do They Operate?
On average, spammers today are faster than they've ever been before. The chart below indicates the average time from harvesting an email address from a web page to when the spammer sends the first email to that address.
|YEAR||AVERAGE TIME FROM HARVEST TO FIRST SPAM|
|2004||49 days 18 hours 54 minutes 15 seconds|
|2005||32 days 15 hours 39 minutes 41 seconds|
|2006||29 days 29 hours 10 minutes 24 seconds|
|2007||23 days 11 hours 53 minutes 03 seconds|
|2008||22 days 12 hours 36 minutes 54 seconds|
|2009||21 days 17 hours 17 minutes 28 seconds|
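An average like those in the table above can be computed from pairs of timestamps: when an address was harvested and when the first spam message reached it. The Python sketch below shows one way to do that. The sample timestamps are made up for illustration, and the function names are hypothetical rather than anything Project Honey Pot publishes.

```python
from datetime import datetime, timedelta


def average_delay(pairs):
    """Mean of (first_spam - harvested) over an iterable of datetime pairs."""
    delays = [first_spam - harvested for harvested, first_spam in pairs]
    if not delays:
        raise ValueError("need at least one (harvested, first_spam) pair")
    return sum(delays, timedelta()) / len(delays)


def format_delay(delay: timedelta) -> str:
    """Render a timedelta in the style used by the table."""
    total = int(delay.total_seconds())
    days, rem = divmod(total, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return f"{days} days {hours} hours {minutes} minutes {seconds:02d} seconds"


# Made-up sample data: (harvested, first spam received).
sample = [
    (datetime(2009, 1, 1, 0, 0), datetime(2009, 1, 21, 12, 0)),  # 20 days 12 hours
    (datetime(2009, 2, 1, 0, 0), datetime(2009, 2, 23, 0, 0)),   # 22 days
]
print(format_delay(average_delay(sample)))
```

Because a simple mean is sensitive to outliers, a handful of addresses that sit unused for years could pull the figure up noticeably; a median would be a reasonable alternative if that matters.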
We have found that speed is tied to the content of the message. "Product" spammers -- those selling an actual product of some kind, whether it be fake pharmaceuticals, college degrees, or mortgage loans -- tend to operate on a slower cycle, spending approximately a month gathering email addresses and then targeting those addresses with a set of spam campaigns. Product spammers tend to hold on to email addresses longer and send on average several messages a week to each address on their list.
On the other hand, "Fraud" spammers -- those committing phishing or so-called "419" advance-fee scams -- tend to send to and discard harvested addresses almost immediately. The increased average speed of spammers appears to be mostly attributable to the rise in spam as a vehicle for fraud rather than an increasing efficiency among traditional product spammers.
One intriguing insight our data provides is that bad guys take vacations too. For example, there is a 21% decrease in spam on Christmas Day and a 32% decrease on New Year's Day. Monday is the biggest day of the week for spam, while Saturday receives only about 60% of the volume of Monday's messages.
The chart below shows the time of day spammers are most likely to send their messages. All times in the chart are set to the East Coast timezone of the United States (GMT -0500).
Whom Do They Target?
Spammers are a creative bunch and we have seen a wide variety of offers show up in the one billion messages we have received. Among products sold through spam, pharmaceuticals remain the most popular. To give you a sense, we've seen the word "Viagra" spelled at least 956 different ways in order to try to trick spam filters (e.g., VIAGRA, V1AGRA, V1@GR@, V!AGRA, VIA6RA, etc.).
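One simple way to count such variants as the same word is to map look-alike characters back to letters before comparing. The Python sketch below does this for the five spellings listed above. The substitution table is an illustrative assumption covering only those examples; it is not the method Project Honey Pot or any particular spam filter uses, and a real filter would need a far larger table plus handling for inserted spaces and punctuation.

```python
# Illustrative look-alike table covering only the spellings quoted above.
LOOKALIKES = str.maketrans({"1": "i", "!": "i", "@": "a", "6": "g"})


def normalize(word: str) -> str:
    """Lowercase a word and replace look-alike characters with letters."""
    return word.lower().translate(LOOKALIKES)


variants = ["VIAGRA", "V1AGRA", "V1@GR@", "V!AGRA", "VIA6RA"]
canonical = {normalize(v) for v in variants}
print(canonical)
```

All five spellings collapse to a single canonical form, which is what lets obfuscated variants be grouped for counting.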
While spammers will often alter their messages to look different, some are remarkably consistent. The table below shows the top message FROM/SUBJECT line pairs over the last five years. We have also included our estimate of how many of each message was sent Internet-wide.
|RANK||FROM||SUBJECT||EST. INTERNET-WIDE VOLUME|
|#1||Instant Booster||Can you afford to lose 300,000 potential customers?||100 billion|
|#2||Internal Revenue Service||Notice of Underreported Income||91 billion|
|#3||Feed Blaster||Receive hundreds of targeted hits to your website||65 billion|
|#4||Hit-Booster||How to get free quality visitors to your website?||51 billion|
|#5||Feed Blaster||Feed Blaster puts your ad right to the screens of millions||44 billion|
To give you some sense, assuming an average message storage requirement of 4KB, over the last 5 years the total storage requirement imposed on the Internet by just the spammers sending the top-20 spam campaigns was over 2.5 petabytes.
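The storage figure follows from multiplying message counts by the assumed 4KB per message. As a rough check, the Python sketch below applies that arithmetic to the five campaigns listed in the table above, and then works backwards from 2.5 petabytes to a message count. It assumes decimal units (1 KB = 1,000 bytes, 1 PB = 10^15 bytes), which the article does not specify; the names are illustrative.

```python
BYTES_PER_MESSAGE = 4_000  # "4KB", taken as decimal kilobytes (assumption)
PETABYTE = 10**15          # decimal petabyte (assumption)

# Estimated Internet-wide volumes of the top five campaigns, in billions.
top5_volumes_billion = [100, 91, 65, 51, 44]


def storage_petabytes(messages: int, bytes_per_message: int = BYTES_PER_MESSAGE) -> float:
    """Storage needed for `messages` messages, in petabytes."""
    return messages * bytes_per_message / PETABYTE


top5_messages = sum(top5_volumes_billion) * 10**9
print(f"Top 5 campaigns: {storage_petabytes(top5_messages):.3f} PB")

# Number of 4KB messages that would occupy 2.5 PB.
messages_for_2_5_pb = 2.5 * PETABYTE / BYTES_PER_MESSAGE
print(f"2.5 PB holds about {messages_for_2_5_pb / 1e9:.0f} billion messages")
```

Under these assumptions the five listed campaigns account for a bit more than half of the quoted total, with the remaining fifteen campaigns making up the rest.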
Beyond the product spam, fraudulent spam increasingly dominates our spam stream. The chart below shows the relative distribution of the most phished organizations online.
While banks and financial institutions still make up a majority of the phishing scams circulated via spam, social networks are increasingly targeted. In 2008, there were virtually no Facebook phishing messages. Today Facebook is the second most phished organization online and, if current trends continue, is on track to take the top spot in 2010.
The Future of Spam
The good news for email users is that filtering technologies have done a terrific job keeping most of the volume of spam messages out of their inboxes. Behind the scenes, however, the volume of email spam continues to grow at a blistering pace. While spam may strike the average user as a minor annoyance, the real risk it continues to pose is providing a viable business model to finance the construction of bot networks. Our research indicates that these bots are increasingly multi-purposed into vectors for new types of attacks ranging from annoyances like comment spam to real threats like distributed denial-of-service (DDoS) attacks.
For example, if you run a blog you are aware of the comment spam attacks your site faces every day. This new breed of spammers uses the forms on websites to post advertisements and links to pages they are paid to promote. Project Honey Pot has been tracking their behavior for two years and has witnessed its growth in volume and sophistication.
|Where Comment Spammers Are|
Looking at the data patterns, comment spam in 2009 resembles email spam when Project Honey Pot began in 2004. While comment spammers today are tending to use a relatively limited set of machines to post their messages, if this new breed of spammers follows the email spammers' lead to massive adoption of bot networks then it will pose a significant threat to websites everywhere.
To counter these increasing threats, web administrators need to continue to share data about attacks they see on their own sites through efforts such as Project Honey Pot. Over the next year, we will be launching a number of new initiatives to increase the protection we offer. In the meantime, if you run a website, we encourage you to become a member of Project Honey Pot today and encourage others to do so as well. Only by working together do we stand a chance to face the challenges that lie ahead.
Finally, thanks to all the current Project Honey Pot members as well as the organizations that have helped us build our infrastructure.