There are plenty of anti-spam plugins that bloggers can use to try to prevent the posting of comment spam. These vary in effectiveness, and in the amount of administration needed to ensure that genuine posts are not categorised as spam or vice versa.
If you can prevent the majority of spammers from targeting your site in the first place, then you will reduce time spent on moderation and the chances of letting spam through.
Most comment spam is generated by automated software with features like these:
- Google search and link harvesting;
- options to hide IP address; and
- auto-filling of comment forms.
Requiring posters to register and enter a CAPTCHA before posting won’t protect you from the “better” spambots, which include pre-registration software and CAPTCHA-cracking plugins.
The sites marketing these “tools” often deny that they are comment spambots, but “if it barks like a dog and bites like a dog, it probably has fleas like a dog”.
I looked at information on five of these spambot tools and found they all use the same “text footprint” method to locate blogs. One of the most popular tools claims to search for, and auto-fill comments on, WordPress, Blogengine, Movable Type, B2Evolution and NucleusCMS blogs.
These “blog commenting tools” also allow users to add custom footprints to find other popular blog platforms; and lists of identifying footprints are easy to find.
Removing these footprints will hide your site from most spambot users, without affecting how genuine visitors find your website.
Don’t feel left out if you run a BBS: there are bots looking for your site too. However, they often use different footprinting methods (see the last section).
What is a Footprint?
In this context, a footprint is a piece of text that identifies the type of website, e.g. a WordPress blog.
Most popular blog and BBS platforms will insert some form of tagline on your pages; those already mentioned all insert “powered by ” followed by the package name.
Default text used in comment forms is another identifying footprint.
Google indexes this identifying text along with the rest of your page’s content.
Footprints continued (how spambots work)
To find targets, a spammer simply includes the relevant footprint(s) and a keyword in a search, e.g. ‘“powered by wordpress” “leave a reply” automobiles’. In this case most results will be pages with open comment forms on WordPress sites about cars.
Spambots automate such searches, gathering addresses for thousands of pages.
The most sophisticated bots will also autopost the spammer’s comment, and may circumvent registration and anti-spam measures (video demo).
Are there footprints for the package I use?
Try a search that includes the name of the software you use, e.g. ‘comments footprints drupal’. It is also possible to find multi-platform lists: try searching for ‘Scrapebox footprints’ (Scrapebox is a “search and commenting tool”).
The entries in these lists may contain both footprints and Google limiting clauses, e.g. ‘site:.edu “what is the” “word in the phrase” drupal’, which restricts Google’s search for Drupal pages with comment forms to sites on .edu domains. You can ignore the limiting clause part of these entries.
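If you collect entries from such lists, a small script can separate the parts that matter (the quoted footprint phrases) from the limiting clauses. The following is only a sketch: the function name `parse_entry` and the list of operators treated as limiting clauses are my own assumptions, not something taken from any particular footprint list.

```python
import re

# Search operators treated as "limiting clauses".
# This list is an assumption for illustration and is not exhaustive.
LIMITING_OPERATORS = ("site:", "inurl:", "intitle:")


def parse_entry(entry):
    """Split one footprint-list entry into phrases, keywords and limits."""
    # Treat curly double quotes the same as straight ones.
    entry = entry.replace("\u201c", '"').replace("\u201d", '"')

    # Quoted phrases are the text footprints themselves.
    phrases = re.findall(r'"([^"]+)"', entry)

    # Whatever is left outside the quotes is either a limiting
    # clause (e.g. site:.edu) or a bare keyword (e.g. drupal).
    remainder = re.sub(r'"[^"]+"', " ", entry)
    keywords, limits = [], []
    for token in remainder.split():
        if token.lower().startswith(LIMITING_OPERATORS):
            limits.append(token)
        else:
            keywords.append(token)

    return {"phrases": phrases, "keywords": keywords, "limits": limits}


if __name__ == "__main__":
    print(parse_entry('site:.edu "what is the" "word in the phrase" drupal'))
```

The phrases returned are the ones to look for (and change) on your own pages; the limits can be discarded.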
How do I make my site invisible to spambots?
The good news: all the blog commenting software I checked used the text footprint method, and this seems to be the technique used by almost all of the current generation of blog spambots. So the solution is to remove identifiable footprints from your pages, e.g. you could change “Powered by …” to “Running …”, and “Leave a Comment” to “Tell us what you think”.
With only a little knowledge, I needed about 10 minutes to modify two (differently themed) WordPress blogs. See: How to remove Footprints from WordPress blogs.
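To check whether a page still carries a footprint after you have edited your theme, you could save the page’s HTML and scan its visible text. This is a minimal sketch, not a definitive tool: the names `visible_text` and `find_footprints` and the `FOOTPRINTS` list are illustrative assumptions, and the crude tag stripping only approximates what a search engine actually indexes.

```python
import re
from html import unescape

# Example footprints taken from the phrases discussed in this article.
# Extend the list with whatever applies to your own platform.
FOOTPRINTS = ["powered by wordpress", "leave a reply", "leave a comment"]


def visible_text(html):
    """Return a rough, lower-cased approximation of the page's visible text."""
    # Drop script and style blocks, then strip the remaining tags.
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    # Decode entities and collapse runs of whitespace.
    return " ".join(unescape(text).split()).lower()


def find_footprints(html, footprints=FOOTPRINTS):
    """List the footprints that still appear in the page text."""
    text = visible_text(html)
    return [fp for fp in footprints if fp.lower() in text]


if __name__ == "__main__":
    before = '<footer>Proudly <a href="#">Powered by</a> WordPress</footer><h3>Leave a Reply</h3>'
    after = "<footer>Running WordPress</footer><h3>Tell us what you think</h3>"
    print(find_footprints(before))
    print(find_footprints(after))
```

An empty list for the edited page suggests the listed footprints are gone; it says nothing about footprints you have not added to the list.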
The bad news: the WordPress solution won’t work for other types of CMS/blog, and some will be easier to modify than others. Popular (and therefore targeted) platforms usually have active support forums, where you can post questions on how to achieve this.
Always back up everything before attempting any changes, and don’t get rid of your anti-spam plugin.
Is it worth modifying my blog?
If your blog is over a year old and still not plagued by spamming attempts, then probably not.
If your blog suffers from spam, or is new, then it may be worthwhile. In many cases, changing its footprint identity could hide it from most users of current-generation spambots and reduce automated spamming attempts to near zero.
Some blogs are more at risk of spam than others:
- WordPress, Movable Type, Blogengine, B2Evolution, and NucleusCMS: Most mass spammers don’t have time to comment on every site by hand, so they will want to automate as much of the process as possible. The tools I checked claimed to be able to find and auto-fill comment forms on one or more of these platforms.
- Blogs allowing “do-follow” links: Automated tools may still be used to find such blogs even if it is not possible to autopost comments on them.
Do-follow links are much more highly valued by Google, and spammers may consider the time to manually paste in comments worthwhile. Links on Drupal sites are “do-follow” by default; and when researching this article I found posts highlighting its “do-followness” and discussing its footprints.
- High page rank blogs: These are targets for the same reason as “do-follow” blogs. Unfortunately, some tools come with pre-compiled lists of very high ranking sites, so footprint removal will have less of an impact for these sites.
What changing your text footprints won’t do:
It won’t result in an immediate reduction in spam. A page will continue to appear in Footprint search results until it has been re-crawled and re-indexed by Google etc.
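One way to see whether your pages still show up for a footprint is to run the same kind of search a spammer would, restricted to your own domain with a ‘site:’ clause. The sketch below only builds the query text and a search link for you to open by hand; the function names and the example.com domain are placeholders I have made up for illustration.

```python
from urllib.parse import urlencode


def footprint_check_query(domain, footprints):
    """Build a search query for footprints, limited to one domain."""
    quoted = " ".join(f'"{fp}"' for fp in footprints)
    return f"site:{domain} {quoted}"


def search_url(query):
    """Turn the query into a Google search link (URL-encoded)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # example.com is a placeholder: substitute your own domain.
    query = footprint_check_query(
        "example.com", ["powered by wordpress", "leave a reply"]
    )
    print(query)
    print(search_url(query))
```

If the search still returns your pages some weeks after the change, they have probably not been re-indexed yet.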
It won’t hide your site from hackers, or from every spammer:
- There are other footprinting techniques, e.g. Googling parts of a web address. This method is not as popular for comment spamming blogs because text footprints are more effective, but URL footprints are often used to find BBS sites, e.g. “phpbb/profile.php”. Removing these types of footprints from your site requires technical knowledge and may be impracticable.
- Some spammers monitor “news feed” syndications for pages with fresh posts (likely to have open comment forms). However, I do not advocate switching off your Remote Publishing (RSS) feeds. Syndication improves the visibility of your site to genuine visitors, and in my own experience the resulting spam is minuscule in comparison to other sources.
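The URL footprint idea from the first point in the list above can be illustrated with a short check of your own site’s addresses. This is a rough sketch under stated assumptions: `URL_FOOTPRINTS` contains only the one example from this article, and `url_footprints_in` is a hypothetical helper name.

```python
from urllib.parse import urlparse

# Only the example mentioned in this article; add others as you find them.
URL_FOOTPRINTS = ["phpbb/profile.php"]


def url_footprints_in(url, footprints=URL_FOOTPRINTS):
    """List the URL footprints found in the path or query of a URL."""
    parsed = urlparse(url)
    # Compare case-insensitively against the path and query string only,
    # since the domain name is not part of the platform's footprint.
    target = (parsed.path + "?" + parsed.query).lower()
    return [fp for fp in footprints if fp.lower() in target]


if __name__ == "__main__":
    print(url_footprints_in("http://example.com/phpBB/profile.php?mode=register"))
    print(url_footprints_in("http://example.com/forum/members"))
```

Finding a match only tells you the address is recognisable; as noted above, changing such addresses may be impracticable.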
Andy Wrigley has worked in IT and Computer Audit for 30 years, and loves independent travel.