1 Billion Spammers Served

Our 1 Billionth Spam Message

PUBLISHED: DECEMBER 15, 2009

On Wednesday, December 9, 2009 at 06:20 (GMT) Project Honey Pot received its billionth email spam message. The message, a picture of which is displayed below, was a United States Internal Revenue Service (IRS) phishing scam. The spam email was sent by a bot running on a compromised machine in India (122.167.68.1). The spamtrap address to which the message was sent was originally harvested on November 4, 2007 by a particularly nasty harvester (74.53.249.34) that is responsible for 53,022,293 other spam messages that have been received by Project Honey Pot.

1 billionth Project Honey Pot Message

Every time Project Honey Pot receives a message we estimate that another 125,000 are sent to real victims. Our billionth message represents approximately 125 trillion spam messages that have been sent since Project Honey Pot started in 2004.
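The arithmetic behind that estimate can be checked in a few lines. This is only a sketch using the two figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the text above.
messages_received = 1_000_000_000   # spam messages received by Project Honey Pot
sent_per_received = 125_000         # estimated messages sent to real victims per message received

estimated_total = messages_received * sent_per_received

print(f"{estimated_total:,}")                     # 125,000,000,000,000
print(f"{estimated_total // 10**12} trillion")    # 125 trillion
```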

At this milestone, we wanted to take a second to report some of our findings. Our goal is not to rehash the same old insights but instead to give a new picture that only looking at five years and a billion data points can produce.

Who Are These Spammers?

Several organizations publish regular reports on the source countries for spam. We have one of our own. The problem is that these reports tell very little about the actual source of spam messages because of the nature of how spam is sent today.

Rather than sending spam directly, spammers primarily use "bot" machines in order to effectively launder their identities. These bots are PCs that have been compromised by a virus and whose owners usually do not know they are being used to send spam. The process is not unlike the stereotypical scene in a movie where the villain keeps his phone call from being traced by relaying it through a number of connections. Similarly, spammers' use of bots can make their messages look like they are coming from somewhere completely different than their actual location. As a result, lists of spam origin countries tell you very little about where the spammers are actually located.

On the other hand, they can help provide insight into a country's security policies because they give evidence on the number of bots operating within a country's borders. Since every country will have a different number of PCs, to make this number comparable we needed to create a ratio. We decided to look at the number of compromised machines operating within the country divided by the number of security professionals operating in the country. This gives us a relative IT security score. As a proxy for the number of security professionals we used members in Project Honey Pot. Here are the results:

Best IT Security
#1	Finland
#2	Canada
#3	Belgium
#4	Australia
#5	Netherlands
#6	United States
#7	Norway
#8	New Zealand
#9	Sweden
#10	Estonia

Worst IT Security
#1	China
#2	Azerbaijan
#3	South Korea
#4	Colombia
#5	Macedonia
#6	Turkey
#7	Viet Nam
#8	Kazakhstan
#9	Macau
#10	Brazil
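
The ratio behind these rankings can be sketched as follows. The country names and counts are made-up placeholders, not Project Honey Pot data, and `security_score` is a hypothetical helper. A lower score means fewer compromised machines per member.

```python
def security_score(compromised_machines: int, members: int) -> float:
    """Compromised machines per Project Honey Pot member (lower is better)."""
    if members <= 0:
        raise ValueError("members must be positive")
    return compromised_machines / members

# Hypothetical counts: (compromised machines, Project Honey Pot members).
sample = {
    "Country A": (1_200, 400),
    "Country B": (90_000, 150),
    "Country C": (5_000, 250),
}

scores = {name: security_score(bots, members) for name, (bots, members) in sample.items()}
ranking = sorted(scores, key=scores.get)  # best (lowest ratio) first

for rank, name in enumerate(ranking, start=1):
    print(f"#{rank}\t{name}\t{scores[name]:.1f}")
```

Dividing by membership is what makes countries with very different numbers of PCs comparable; the raw bot count alone would mostly reflect country size.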

Because sending spam remains the primary use of bots, Project Honey Pot has a unique perspective on bot network activity. Since 2004, active bots have grown at a compound annual growth rate of more than 378%. In other words, the number of bots has nearly quadrupled every year. In 2009, you could find nearly 400,000 active bots engaged in malicious activity on any given day with several million active over the course of any month.

Fortunately, Project Honey Pot's coverage of active botnets has grown over time at an even faster rate. In 2006, we saw fewer than 20% of the active bots on any given day. Today we see more than 80%.

Project Honey Pot's Bot Coverage

While tracking bots has become a critical aspect of Project Honey Pot, we remain curious where the spammers are actually located. To get at this information it's critical to look at spammer activities that are not laundered through bots. While sending email spam can easily be done in parallel (i.e., 1 million machines can send one message each) harvesting email addresses, which involves crawling web pages, cannot. This makes sense: crawling without centralized command and control will result in a lot of crawling of the same popular pages over and over again.

Our research indicates that, unlike the bots used to send spam, the machines used for harvesting tend to be more permanent, stable, and closely connected to the actual spammer's location. So where are the spammers actually located? We think the list below gives the most accurate approximation.

Where Harvesters Are
#1	United States
#2	Spain
#3	Netherlands
#4	United Arab Emirates
#5	Hong Kong
#6	Romania
#7	Great Britain
#8	China
#9	South Africa
#10	Germany

How Do They Operate?

On average, spammers today are faster than they've ever been before. The chart below indicates the average time from harvesting an email address from a web page to when the spammer sends the first email to that address.

2004	49 days 18 hours 54 minutes 15 seconds
2005	32 days 15 hours 39 minutes 41 seconds
2006	29 days 29 hours 10 minutes 24 seconds
2007	23 days 11 hours 53 minutes 03 seconds
2008	22 days 12 hours 36 minutes 54 seconds
2009	21 days 17 hours 17 minutes 28 seconds
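
For a single address, the harvest-to-first-message figure is the gap between two timestamps, and a yearly value is the mean of those gaps. Below is a minimal sketch, assuming each spamtrap address is recorded as a (harvested, first spam) pair. The timestamps are invented for illustration and `average_turnaround` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def average_turnaround(pairs):
    """Mean time between harvesting an address and the first spam sent to it."""
    gaps = [first_spam - harvested for harvested, first_spam in pairs]
    return sum(gaps, timedelta()) / len(gaps)

# Hypothetical (harvested, first spam received) timestamps.
pairs = [
    (datetime(2009, 1, 1, 0, 0), datetime(2009, 1, 21, 0, 0)),    # 20 days
    (datetime(2009, 3, 5, 12, 0), datetime(2009, 3, 29, 12, 0)),  # 24 days
]

avg = average_turnaround(pairs)

# Break the sub-day remainder into hours, minutes, and seconds for display.
hours, rem = divmod(avg.seconds, 3600)
minutes, seconds = divmod(rem, 60)
print(f"{avg.days} days {hours} hours {minutes} minutes {seconds:02d} seconds")
```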

We have found that speed is tied to the content of the message. "Product" spammers -- those selling an actual product of some kind, whether it be fake pharmaceuticals, college degrees, or mortgage loans -- tend to operate on a slower cycle, spending approximately a month gathering email addresses and then targeting those addresses with a set of spam campaigns. Product spammers tend to hold on to email addresses longer and send on average several messages a week to each address on their list.

On the other hand, "Fraud" spammers -- those committing phishing or so-called "419" advance-fee scams -- tend to send to and discard harvested addresses almost immediately. The increased average speed of spammers appears to be mostly attributable to the rise of spam as a vehicle for fraud rather than to increasing efficiency among traditional product spammers.

One intriguing insight our data provides is that bad guys take vacations too. For example, there is a 21% decrease in spam on Christmas Day and a 32% decrease on New Year's Day. Monday is the biggest day of the week for spam, while Saturday receives only about 60% of the volume of Monday's messages.

Volume of Spam by Day of the Week

The chart below shows the time of day spammers are most likely to send their messages. All times in the chart are set to the East Coast timezone of the United States (GMT -0500).

Volume of Spam by Time of Day

Whom Do They Target?

Spammers are a creative bunch and we have seen a wide variety of offers show up in the one billion messages we have received. Among products sold through spam, pharmaceuticals remain the most popular. To give you a sense, we've seen the word "Viagra" spelled at least 956 different ways in an attempt to trick spam filters (e.g., VIAGRA, V1AGRA, V1@GR@, V!AGRA, VIA6RA, etc.).
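
One way a filter might cope with such variants is to fold common look-alike characters back to letters before matching. The substitution table below is an assumption chosen to cover the examples quoted above. It does not describe any particular filter, and it would not catch every one of the 956 spellings.

```python
# Hypothetical substitution table covering the variants quoted above.
SUBSTITUTIONS = str.maketrans({"1": "i", "!": "i", "@": "a", "6": "g"})

def normalize(word: str) -> str:
    """Lower-case a word and undo common look-alike substitutions."""
    return word.lower().translate(SUBSTITUTIONS)

variants = ["VIAGRA", "V1AGRA", "V1@GR@", "V!AGRA", "VIA6RA"]
normalized = [normalize(v) for v in variants]

print(all(n == "viagra" for n in normalized))  # True
```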

While spammers will often alter their messages to look different, some are remarkably consistent. The table below shows the top message FROM/SUBJECT line pairs over the last five years. We have also included our estimate of how many copies of each message were sent Internet-wide.

RANK	FROM	SUBJECT	EST. INTERNET-WIDE VOLUME
#1	Instant Booster	Can you afford to lose 300,000 potential customers?	100 billion
#2	Internal Revenue Service	Notice of Underreported Income	91 billion
#3	Feed Blaster	Receive hundreds of targeted hits to your website	65 billion
#4	Hit-Booster	How to get free quality visitors to your website?	51 billion
#5	Feed Blaster	Feed Blaster puts your ad right to the screens of millions	44 billion

To give you some sense, assuming an average message storage requirement of 4KB, over the last 5 years the total storage requirement imposed on the Internet by just the spammers sending the top-20 spam campaigns was over 2.5 petabytes.
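
As a rough cross-check of that figure, the sketch below works the arithmetic in both directions. It assumes decimal units (1 KB = 1,000 bytes, 1 PB = 10^15 bytes), which the text does not specify, and takes the top-five volumes from the table above.

```python
KB = 1_000       # assumption: decimal units (1 KB = 1,000 bytes)
PB = 10 ** 15    # 1 PB = 10**15 bytes under the same assumption

avg_message_bytes = 4 * KB

# Messages implied by 2.5 PB of storage at 4 KB per message.
implied_messages = 2.5 * PB / avg_message_bytes
print(f"{implied_messages / 1e9:.0f} billion messages")   # 625 billion messages

# Storage for the five campaigns listed in the table (volumes in billions).
top5_messages = (100 + 91 + 65 + 51 + 44) * 10 ** 9
top5_bytes = top5_messages * avg_message_bytes
print(f"{top5_bytes / PB:.2f} PB")                         # 1.40 PB
```

Under these assumptions the five listed campaigns alone account for a little over half of the 2.5 petabyte total.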

Beyond the product spam, fraudulent spam increasingly dominates our spam stream. The chart below shows the relative distribution of the most phished organizations online.

Most Phished Organizations

While banks and financial institutions still make up a majority of the phishing scams circulated via spam, social networks are increasingly targeted. In 2008, there were virtually no Facebook phishing messages. Today Facebook is the second most phished organization online and, if current trends continue, is on track to take the top spot in 2010.

The Future of Spam

The good news for email users is that filtering technologies have done a terrific job keeping most of the volume of spam messages out of their inboxes. Behind the scenes, however, the volume of email spam continues to grow at a blistering pace. While spam may strike the average user as a minor annoyance, the real risk it continues to pose is providing a viable business model to finance the construction of bot networks. Our research indicates that these bots are increasingly repurposed as vectors for new types of attacks, ranging from annoyances like comment spam to real threats like distributed denial-of-service (DDoS) attacks.

For example, if you run a blog you are aware of the comment spam attacks your site faces every day. This new breed of spammers uses the forms on websites to post advertisements and links to pages they are paid to promote. Project Honey Pot has been tracking their behavior for two years and has witnessed its growth in volume and sophistication.

Where Comment Spammers Are
#1	United States
#2	China
#3	Brazil
#4	Japan
#5	Russia
#6	South Korea
#7	Ukraine
#8	Poland
#9	Germany
#10	Hong Kong

Looking at the data patterns, comment spam in 2009 resembles email spam when Project Honey Pot began in 2004. While comment spammers today are tending to use a relatively limited set of machines to post their messages, if this new breed of spammers follows the email spammers' lead to massive adoption of bot networks then it will pose a significant threat to websites everywhere.