Spam Part Three - Babies with the Bathwater
This post was written by our Chief Spam Fighter and delves into the subject of why spam is such a tricky little beast. It was prompted by a question that a Searchme user posted at getsatisfaction.com.
Now that we know how spam fights, cheats and multiplies, let’s talk about why it’s a particularly tricky problem from the search engine side of things. Namely, we have to be really careful that we only remove spam and not the good sites.
Let’s drag out the stadium analogy one last time. We know the end zone’s full of spammers, but what if Baby Kylie and Aunt Millie happen to be sitting there as well? We obviously don’t want to get rid of them, so we can’t just blast the area.
At Searchme, we strive for a zero error rate - not one site mistakenly removed from our search results for being spam; not a single baby with the bathwater. Almost all other aspects of search are relatively mistake-tolerant and work well ‘on average’, but identifying spam is not one of them. ‘On average’ could mean that a search engine is wrong up to half of the time, and nobody can afford to be 50% wrong when it comes to identifying spam. So we have to look at every site very closely - no big sweeps.
Also, what if some of the hooligans look an awful lot like Aunt Millie? We have to be so cautious about not getting rid of what may be a good site that sometimes a spam site won’t be identified. (The good news is that since this is a two-way adversarial street, a site missed today will be found tomorrow.)
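To make that trade-off concrete, here is a toy sketch in Python. It is not how Searchme actually scores sites: the spam scores, the site names and the function names are all made up for illustration, and it assumes we somehow already know which sites are truly spam. The idea is simply that if the removal cutoff must sit above every good site, any spammer who scores below that cutoff slips through.

```python
from dataclasses import dataclass


@dataclass
class Site:
    url: str
    spam_score: float  # hypothetical score in [0, 1]; higher looks more like spam
    is_spam: bool      # ground truth, known only because this is toy data


def strictest_safe_threshold(sites):
    """Return the highest score held by any legitimate site.

    Removing only sites that score strictly above this value guarantees
    that no legitimate site in the sample is removed.
    """
    legit_scores = [s.spam_score for s in sites if not s.is_spam]
    return max(legit_scores, default=0.0)


def removed(sites, threshold):
    """Sites that would be dropped from the results at this threshold."""
    return [s for s in sites if s.spam_score > threshold]


# Made-up stadium crowd: two good sites and three spammers.
sites = [
    Site("aunt-millie.example", 0.62, False),
    Site("baby-kylie.example", 0.10, False),
    Site("hooligan-1.example", 0.95, True),
    Site("hooligan-2.example", 0.55, True),  # looks an awful lot like Aunt Millie
    Site("hooligan-3.example", 0.81, True),
]

threshold = strictest_safe_threshold(sites)
flagged = removed(sites, threshold)
missed = [s for s in sites if s.is_spam and s not in flagged]

print("threshold:", threshold)
print("removed:", [s.url for s in flagged])
print("missed spam:", [s.url for s in missed])
```

In this made-up crowd, no good site is removed, but the spammer who resembles Aunt Millie stays in the results until some better signal tells the two apart.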
So, this is why you still find spam pages in search engines, despite our best efforts. Spammers use every dirty trick in the book, knowing that we, the search engines, have to be very careful in how we get rid of them.
The good news is, we won’t give up the fight.

April 15th, 2008 at 4:12 pm
One of the drawbacks to showing a single result per search term is that the “Spammy Bird Gets The Worm”.
For example, in Google I’m ranking very highly because I’m NOT using Black Hat techniques to achieve search engine placement.
You folks have created a visual template that needs to be expanded to perhaps 6-8 results per set of search terms (use thumbnails of the sites from Screen Shots or another archive service).
Best of luck on your Beta. I hope you succeed in your venture, as Google could always use some competition….LOL