Trackback Spam Gateway

It’s over. My referrer experiment is over… at least, in its current form. Today, I roll out blog.adamscheinberg.com referrer gateway version 1.0. That makes it sound fancy, but it’s not. Basically, it’s PHP to prevent trackback spam.

Traffic at blog.adamscheinberg.com has grown steadily, for some reason, and the logs reveal it: we get a TON of traffic from search engines, and the most popular terms are surprising – sensitive readers beware – here are the terms that most frequently drive people here:

cumtube, red-tube, uporn, adult youtube, milf, gay tube, tube 8 and many more equally odd terms.

You know why? Because, in a shrewd move that search engines seem to love, I display links back to my referrers, treating them like trackbacks. But when a referrer isn’t Google, Yahoo, Live.com, or OSNews, it’s most often spam. Why? Because not only are we using the name “tube” in our title, but with each erroneous entry, we tell the search engine it’s a good match by linking back to that search. In short, I’m perpetuating the problem. As a result, dozens of spammers have begun issuing basic GET requests by the hundreds to place their sites in my referrer lists.

Some time ago, I began the battle by adding rel="nofollow" to all outgoing links not added via the admin section. But alas, that wasn’t good enough; the spammers didn’t care. So I implemented a pre-check, whereby referrers are matched, via regular expressions, against a list of known crap. As of today, there are 36 terms that I actively filter. In time, this will become performance intensive, if it isn’t already.
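
For the curious, a pre-check along these lines can be sketched in a few lines. This is an illustration in Python, not the PHP that actually runs on this blog. The term list, the name `is_blocked`, and the sample URLs are all made up for the example, and the real list of 36 terms is not reproduced here.

```python
import re

# Hypothetical blocklist for illustration only; the real filter has 36 terms.
BLOCKED_TERMS = ["lyrics", "casino", "pills"]

# Build one combined, case-insensitive pattern up front so each referrer
# costs a single regex search instead of one search per term.
BLOCKED_RE = re.compile(
    "|".join(re.escape(term) for term in BLOCKED_TERMS),
    re.IGNORECASE,
)


def is_blocked(referrer: str) -> bool:
    """Return True if the referrer URL contains any blocked term."""
    return BLOCKED_RE.search(referrer) is not None
```

Compiling a single alternation is one way to keep the cost from growing too quickly as terms are added, though a long enough list will still get slow eventually.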

Thus, a gateway. Now, *all* referring traffic goes into a temp table, and each entry must be approved. I wrote a nice tool to batch import, batch delete, or even approve based on certain filters, such as domain or term. As it matures and I get a feel for how much time the approvals take, I will “whitelist” certain domains that can immediately post to the referrer table. In the meantime, I need to decide if I want to filter referrers with obscene, unrelated terms or just leave them and let the magic run its course; after all, these are not “spam,” they are simply organic mistakes. An argument could be made that they’re interesting, and that this is mostly the reason to post referrers in the first place: to see what terms and sites around the internet drive traffic to a site.
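
The gateway idea can be sketched roughly as follows. Again, this is Python with an in-memory SQLite database standing in for the real PHP and database. The table names (`pending_referrers`, `referrers`), the function names, and the example domains are assumptions invented for the sketch, not the blog's actual schema.

```python
import sqlite3
from urllib.parse import urlparse


def open_db() -> sqlite3.Connection:
    """Create a throwaway database with a pending table and a public table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pending_referrers (id INTEGER PRIMARY KEY, url TEXT, domain TEXT)"
    )
    conn.execute(
        "CREATE TABLE referrers (id INTEGER PRIMARY KEY, url TEXT, domain TEXT)"
    )
    return conn


def record_referrer(conn, url, whitelist=frozenset()):
    """Store a referrer; only whitelisted domains skip the approval queue."""
    domain = (urlparse(url).hostname or "").lower()
    # The table name comes from this fixed choice, never from user input.
    table = "referrers" if domain in whitelist else "pending_referrers"
    conn.execute(f"INSERT INTO {table} (url, domain) VALUES (?, ?)", (url, domain))
    return table


def approve_domain(conn, domain):
    """Batch-approve: move every pending entry for a domain to the public table."""
    conn.execute(
        "INSERT INTO referrers (url, domain) "
        "SELECT url, domain FROM pending_referrers WHERE domain = ?",
        (domain,),
    )
    cur = conn.execute("DELETE FROM pending_referrers WHERE domain = ?", (domain,))
    return cur.rowcount  # number of entries moved


def reject_domain(conn, domain):
    """Batch-delete: drop every pending entry for a domain."""
    cur = conn.execute("DELETE FROM pending_referrers WHERE domain = ?", (domain,))
    return cur.rowcount
```

Under these assumptions, a call like `record_referrer(conn, url, {"www.osnews.com"})` would post OSNews hits straight to the public table, while everything else waits for `approve_domain` or `reject_domain`.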

Anyway, spammers, take note: I gotcher number! Stop referrer spamming me! That means you, you stupid lyrics sites!