The Nuts and Bolts of Spam

Spam is one of the perennial challenges of making the Internet work. It doesn't just waste your time as you delete unwanted messages from your inbox; it wastes bandwidth, hours spent designing systems to block it, and money spent on implementing such systems. And spam has moved beyond email, now infesting Internet forums and the comment sections of blogs and popular sites like YouTube. Why is spam such a big problem, and why is it so hard to deal with?

Contents

Why spam?
Who spams?
How do they spam?
How do they make money?
    Phishing
    "419" scams
    "Herbal Viagra" and the like
    Small-cap stocks
    Advertising
    Viruses
    Noise
How do they find you?
    Buy a list
    Scrape the web
    Guess and check
    Just guess
What can you do about it?

Why spam?

Money.

Okay, that's the short answer. The longer answer is that all forms of spam are ultimately intended to provide some sort of economic benefit (which is to say, money) to the originator. The methods, as we will see, vary. But the bottom line is almost always money.

Who spams?

Inconsiderate people that want money.

The short answer is not quite so fulfilling here, to be sure. The immediate answer to "who spams" is "lots of innocent people" because, in fact, most spam is sent by "zombie" computers that have been compromised and are now part of a botnet. And most of those machines, as it turns out, are in either the United States or Europe. But, well, we'll get to that. And none of it answers the question of who is really responsible for the spam.

The answer to that is not always easy. Spammers can be individuals, companies, or even organized crime syndicates based anywhere in the world. The most egregious spammers tend to come from reasonably affluent countries (affording them easy access to the computers and networks that they need to control botnets) with relatively few laws governing Internet behavior, and frequently have a history of fraud or other crimes.

Perhaps the biggest exception is the "419" scam discussed below, which is frequently linked to the country of Nigeria. Nigeria is not an especially rich nation, but has adequate Internet infrastructure and plenty of people willing to send out spam email in hopes of fleecing a foreigner.

How do they spam?

Most sizeable modern spam operations take advantage of botnets to do their dirty work. A botnet is a group of computers that have been compromised by a virus or other attack, and which now accept commands from a "botmaster". Botnets can grow quite large, with some drawing in tens of thousands of computers. The legitimate owners of the computers (which may be individuals, corporations, or even governments) are generally unaware that anything is amiss.

A spammer may build his own botnet or, in a perverse sort of free market, he may strike a deal with someone that already controls a botnet. Botnets, after all, can do more than just spam; they are also an effective tool for creating distributed denial of service attacks, for example.

That provides the hardware needed, but spammers still need to deal with the task of actually sending email. This means finding a cooperative (and not black-listed by the target) email server. Spammers like to take advantage of free web-based email services for this, both because they are free and because they have enough legitimate users that other email server administrators will be unlikely to block the entire domain. This often means getting past a CAPTCHA, which may be dealt with through a character-recognition script, actual human intervention, or (depending on the type of CAPTCHA) repeated random guesses. Botnets, as it happens, are also well-suited to the random-guess technique.

Spammers will then generally forge the return address (SMTP, the protocol used to send virtually all email, was created in a more innocent, trusting time) to make it harder to trace back to them.
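To see why forging is so easy, remember that the sender fields in a message are just text supplied by whoever wrote it. The Python sketch below uses the standard library's email package to compare the visible From header with the Return-Path header recorded on delivery. The function name sender_mismatch and the sample addresses are made up for illustration, and a mismatch is only a hint, since legitimate mailing lists and bulk mailers can produce one too.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_mismatch(raw_message: str) -> bool:
    """Return True when the From and Return-Path domains differ."""
    msg = message_from_string(raw_message)
    # parseaddr returns (display name, address); we only need the address.
    _, from_addr = parseaddr(msg.get("From", ""))
    _, return_addr = parseaddr(msg.get("Return-Path", ""))
    from_domain = from_addr.rpartition("@")[2].lower()
    return_domain = return_addr.rpartition("@")[2].lower()
    # A missing header is treated as "cannot tell", not as a mismatch.
    if not from_domain or not return_domain:
        return False
    return from_domain != return_domain


# Hypothetical message: the visible sender claims to be a bank,
# but bounces would go to an unrelated domain.
SAMPLE = (
    "Return-Path: <bounce@mailer.example.net>\n"
    "From: Your Bank <support@bank.example.com>\n"
    "Subject: Urgent\n"
    "\n"
    "Please verify your account.\n"
)

print(sender_mismatch(SAMPLE))
```

Nothing in the message format stops the author from typing any address at all into either field, which is the weakness spammers lean on.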

Most of the same methods apply to other forms of spam, such as comment-spam on blogs, with minor changes. Comment-spam will still usually need to get past a CAPTCHA generated by blog software, and will generally include a false name and email address to add an air of legitimacy.

How do they make money?

In essence, most spam operations run on some permutation of one of a few basic ideas.

Phishing

Phishing has made a lot of headlines in recent years, in part because it is a very direct form of social engineering attack. A message tries to convince you to go to a fraudulent website that happens to look a lot like your bank's, or eBay's, or some other site, and enter your log-in information (which of course the malicious website will then store, so that the spammer can later pretend to be you). Or, the message may even just tell you to reply directly to the email with such information. These emails usually say something along the lines of "your account will be closed if you don't act now!" in order to get the recipient to act without thinking.
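One common tell in such messages is a link whose visible text names one site while the underlying address points somewhere else. As a rough sketch of how that could be checked, the Python below walks the links in an HTML message and flags any whose text looks like a URL for a different host. The class and function names and the sample hosts are hypothetical, and real phishing messages use many tricks that this simple comparison would miss.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def suspicious_links(html: str):
    """Return links whose visible text names a different host than the href."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        # Only compare when the visible text itself looks like a URL.
        if not text.lower().startswith(("http://", "https://", "www.")):
            continue
        shown = urlparse(text if "://" in text else "http://" + text).hostname
        actual = urlparse(href).hostname
        if shown and actual and shown != actual:
            flagged.append((href, text))
    return flagged


# Hypothetical message body: the text says "bank", the link goes elsewhere.
BODY = '<a href="http://evil.example.net/login">www.bank.example.com</a>'
print(suspicious_links(BODY))
```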

"419" scams

A variant on phishing, 419 scams entice the recipient to hand over money directly by claiming that in exchange, the victim will eventually get a large sum of money. The name comes from section 419 of the Nigerian criminal code, which deals with fraud. These scams take many forms, often originating in Nigeria, but the premise is always the same.

"Herbal Viagra" and the like

Suppose that you have a product that purports to have medicinal properties, but you have no money for marketing, testing, or legal approvals. Clearly, spam is an inexpensive way for you to get the word out to lots of people. And by frequently shifting from one domain to another, you can frustrate attempts by law enforcement to shut you down. It doesn't matter (to you) that your products may not work, and may be dangerous; you'll have disappeared before anyone can complain. These scams are closely associated with Viagra and similar drugs, because the people that would like to buy such products are sometimes too embarrassed to talk to a doctor in person.

As a bonus follow-on, this scam gives the evil-doer all the credit card information that he needs to ring up some fraudulent purchases.

Small-cap stocks

Particularly when the stock market is doing well, spammers will utilize "pump and dump" tactics to make money from so-called small-cap stocks. Small-cap (short for small capitalization) stocks are the stocks of companies that are very small; the total valuation of all the shares of stock in the company combined (its capitalization) is not large. The chosen stock also usually has a low per-share price, to reinforce its appearance of being a great value. Spammers buy up a decent number of shares in such a company, then start sending out messages about what a bargain the stock is. This inspires a spike of buying activity, which in turn inflates the price of the stock (pumping). The spammer then sells all of his shares (dumping) at the elevated price, thus turning a profit.
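The arithmetic behind the scheme is simple enough to sketch in a few lines of Python. Every figure here is invented purely for illustration, the function name is hypothetical, and the model assumes the spammer can sell every share at the inflated price, which ignores the way a large sale pushes the price back down.

```python
def pump_and_dump_profit(shares: int, buy_price: float, sell_price: float,
                         fee_per_trade: float = 0.0) -> float:
    """Profit from buying `shares` at buy_price and selling all at sell_price."""
    cost = shares * buy_price + fee_per_trade
    proceeds = shares * sell_price - fee_per_trade
    return proceeds - cost


# Hypothetical figures: 100,000 shares bought at $0.10 each,
# then sold at $0.15 each after the spam run inflates the price.
profit = pump_and_dump_profit(100_000, 0.10, 0.15)
print(f"profit: ${profit:,.2f}")
```

A five-cent move looks trivial, which is part of why low-priced stocks are chosen: a small absolute change is a large percentage gain.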

Advertising

This is yet another tactic that manipulates legitimate services. A spammer registers a domain (often hopping quickly from one to another, making use of a policy known as "domain tasting" so that he never actually pays a registration fee), then sends out email claiming that this site has something really great that you absolutely must see (given the composition of the Internet, it is perhaps not surprising that the content is often pornography). In reality the site just hosts advertising. Not long ago this was often served through Google AdSense, but as Google has increased efforts to curtail this practice, spammers have sought other options. All it takes is a handful of clicks per year to make this activity profitable. Multiply that small profit by the thousands of domains that a single malcontent might control, and it can be a lot of money.

Viruses

Rather than urge you to click on a link to a website, that message might instead urge you to open an attached file that contains a virus. The exact nature of the virus could be anything; typically it will either seek to steal personal information (through a keylogger that records all of the victim's keystrokes, or by transferring important-looking files to its master) or to recruit the victim computer into a botnet (so that that computer can send out more spam and perpetuate the cycle).

Noise

Some spam emails are literally just random words with no apparent purpose. No links to anything, no attachments, nothing to encourage you to take some action, just random text. These messages are meant to make life harder for content-based spam filters, which examine the text of an email to try to separate legitimate email from spam. In electrical terms, they are raising the "noise floor" for these systems by adding "noise" that is hard to separate from the desired "signal". The net effect is to worsen the overall performance of these filtering mechanisms.
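One plausible way this plays out, assuming a filter that learns from messages people have marked as spam, is that ordinary words start to look suspicious. The toy Python below is only a sketch of that idea and not how any particular filter works: it scores a word by the share of its appearances that were in spam, then shows the score for an everyday word before and after some random-word messages are added to the spam pile. The tiny corpus and the function names are invented for the example.

```python
from collections import Counter


def train(messages):
    """Count in how many spam and non-spam messages each word appears."""
    spam, ham = Counter(), Counter()
    for text, is_spam in messages:
        target = spam if is_spam else ham
        target.update(set(text.lower().split()))
    return spam, ham


def spamminess(word, spam, ham):
    """Share of the word's appearances that were in spam (0.5 if unseen)."""
    total = spam[word] + ham[word]
    return spam[word] / total if total else 0.5


# Hypothetical training data: (message text, was it marked as spam?)
corpus = [
    ("cheap pills online", True),
    ("meeting moved to friday", False),
    ("lunch on friday", False),
]
# Random-word "noise" messages that end up labeled as spam.
noise = [
    ("friday garden meeting river", True),
    ("lunch window friday orange", True),
]

before = spamminess("friday", *train(corpus))
after = spamminess("friday", *train(corpus + noise))
print(before, after)
```

In this toy setup the word "friday" goes from never appearing in spam to appearing in spam as often as in real mail, so a legitimate message about Friday's meeting becomes harder to tell apart from junk.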

It is also possible that some of these emails are intended to test just how aggressive the spam filters of different email services are (a spammer might register a "testing address" with the service to see what sort of spam gets through).

How do they find you?

Like the telemarketer that calls you during dinner, a spammer needs to get email addresses somewhere. The methods are not exactly the same, but they are similar.

Buy a list

As with phone numbers, there are individuals and companies that collect lists of once-active email addresses and sell them to whoever will buy. While they can't guarantee that all of the addresses will be currently active, it doesn't take a very high percentage to satisfy a spammer.

Scrape the web

One very simple way to collect email addresses that are likely to be actively checked by a human being is to crawl the web, looking for the familiar joe@example.com format. Large forums are particularly good targets, as unsuspecting users will post their email addresses "in the clear" (that is, where anyone can see them) with considerable frequency.
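The familiar format is exactly what makes an address easy for a program to spot. From the defender's side, the Python sketch below checks a piece of text (say, a forum post you are about to publish) for addresses sitting in the clear, and shows one crude way to disguise them. The pattern is deliberately simple and will not match every valid address, the function names are hypothetical, and a determined scraper can undo this sort of disguise.

```python
import re

# Deliberately simple pattern; real address syntax is far broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def exposed_addresses(page_text: str):
    """Return the distinct addresses visible in the text, in order seen."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        if match not in seen:
            seen.append(match)
    return seen


def obfuscate(page_text: str) -> str:
    """Rewrite each address in the form 'user at example dot com'."""
    def _swap(m):
        return m.group(0).replace("@", " at ").replace(".", " dot ")
    return EMAIL_RE.sub(_swap, page_text)


post = "Questions? Write to joe@example.com and I'll reply."
print(exposed_addresses(post))
print(obfuscate(post))
```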

Guess and check

The economics of spam are such that sending 100 messages and reaching only 1 human is still not bad. So why not just send spam to completely randomly generated Hotmail or Gmail addresses? There are enough legitimate addresses on those domains that the odds of a random character string actually being someone's email address are fairly high. So spammers do that, but then listen for any activity that indicates a real human read the message. This may mean including a link in the email (perhaps one labeled "unsubscribe"), encouraging the recipient to reply to the message, or including an image in the email that is loaded from the spammer's own server (this is one of the reasons that many email clients now refuse to load external images except from trusted senders). Having identified a particular email address as "live", the spammer can target it for even more spam.
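The image trick works because fetching the picture tells the spammer's server that someone opened the message. As a sketch of the kind of check an email client might make before displaying a message, the Python below lists every image in an HTML body that would be loaded from a remote web server. The class name, function name, and sample URL are invented for the example, and real clients do considerably more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteImageFinder(HTMLParser):
    """Collect the src of every <img> that points at an http(s) server."""

    def __init__(self):
        super().__init__()
        self.remote = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Images embedded in the message itself (cid:) or inline data
        # (data:) do not contact a server, so only http(s) is flagged.
        if urlparse(src).scheme in ("http", "https"):
            self.remote.append(src)


def remote_images(html_body: str):
    """Return the URLs of all remotely hosted images in an HTML body."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    return finder.remote


# Hypothetical message: the query string could identify the recipient.
body = ('<p>Hello!</p>'
        '<img src="http://tracker.example.net/pixel.gif?id=abc123">'
        '<img src="cid:logo">')
print(remote_images(body))
```

A unique identifier in the image address, like the made-up id above, is all it takes to tie the download back to one specific recipient.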

The same thing happens to smaller domains, but in a slightly different way. Spammers will try to guess addresses based on a dictionary of common ones, often leaning heavily on common first names. So if they decide to target the example.com domain, they might send emails to aaron@example.com, adam@example.com, and so on.

Just guess

Sometimes spammers don't even bother with the "check" step, and just blast messages randomly across the Internet in hopes of hitting something. The concept is the same, but the precision is even worse. This method is especially well-suited to spammers that are simply trying to increase the noise floor of email, as discussed above.

What can you do about it?

So now that you know the tricks that spammers use, a part of you may want to do something to help stop them. And the honest truth is that there isn't much you can do. The best advice is this: don't be a victim.

Don't click on suspicious links promising free porn. Use a browser with a decent security reputation (such as Opera or Firefox), and run an anti-virus application with regularly updated virus definitions. If a Nigerian prince offers you millions of dollars if you can get him out of a jam, hit delete. If an email full of garbled words and gibberish tries to entice you to buy cheap Viagra online, don't click the link to buy it and don't click the "unsubscribe" link either.