Spam Part Two - Attack of the Clones
Tuesday, April 8th, 2008
This post was written by our Chief Spam Fighter and delves into the subject of why spam is such a tricky little beast. It was prompted by a question that a Searchme user posted at getsatisfaction.com.
In Spam Part One, I touched on the adversarial nature of spammers, how they cheat by yelling and shape-shifting. Now let’s discuss the second reason why spam is a particularly tricky problem: The numbers.
First of all, there is just so much darn spam out there. Billions and billions of pages. Dealing with the sheer mass of it is a never-ending, soul-wearying battle.
Second of all, spammers multiply like the devil. Say each person in our stadium represents one good site. Well, the spammers in the crowd have found a way to clone themselves, so what looks like a whole end zone full of people could in fact be one bad spammer. This cloning process is so fast and so cheap that even if we cleared out the area at halftime, it would be full again by the third quarter.
Here’s an example to illustrate this point: We once found a spam site that led to 381 billion pages. One domain created a flood of spam pages that was more than ten times the size of Google’s index.
That’s the kind of enemy we’re dealing with.
Next time I’ll post about how hard it is to distinguish what is and is not spam (even though it’s everywhere).
Tomorrow: Spam Part Three - Babies with the Bathwater



