Automated Comment Spam Algorithm
Sunday, 20 November 2005
Randfish over at SEOmoz.org blogs on the beginnings of a technique for working out how much a site relies on comments for its inbound links.
- Run a linkdomain command at Yahoo! with the following syntax: “linkdomain:url.com -site:url.com”, and record the number of results (sample for SEOmoz - 7810)
- Run a linkdomain command at Yahoo! with this syntax: “linkdomain:url.com -comment -comments -forum -reply -site:url.com”, and record the number of results (sample for SEOmoz - 1770)
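- Subtract the second count from the first and divide by the first. The second query excludes pages that mention comments or forums, so the result is a rough percentage of the site's links that come from forum / blog comment pages. (For SEOmoz: (7810 - 1770) / 7810 = 6040 / 7810, or about 77.3% - no wonder we were in the sandbox forever…)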
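The arithmetic in the steps above can be sketched in a few lines of Python. This is only an illustration: the function names `build_queries` and `comment_link_share` are made up for this example, and the result counts are assumed to be copied in by hand from the Yahoo! result pages rather than fetched automatically.

```python
def build_queries(domain):
    """Return the two linkdomain query strings described in the steps above."""
    # All inbound links, excluding the site's own pages.
    all_links = f"linkdomain:{domain} -site:{domain}"
    # Same query, minus pages that mention comment/forum terms.
    filtered = f"linkdomain:{domain} -comment -comments -forum -reply -site:{domain}"
    return all_links, filtered


def comment_link_share(total_links, filtered_links):
    """Rough fraction of inbound links that come from comment/forum pages.

    total_links    -- result count for the first query (hypothetical input)
    filtered_links -- result count for the second, filtered query
    """
    if total_links <= 0:
        raise ValueError("total_links must be positive")
    if not 0 <= filtered_links <= total_links:
        raise ValueError("filtered_links must be between 0 and total_links")
    return (total_links - filtered_links) / total_links


if __name__ == "__main__":
    for query in build_queries("url.com"):
        print(query)
    # Sample counts quoted for SEOmoz in the steps above.
    share = comment_link_share(7810, 1770)
    print(f"{share:.1%} of links appear to come from comment/forum pages")
```

Keeping the counts as plain inputs means the same calculation works with numbers from any search engine that supports a similar link query.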
An additional refinement would be to work out whether the site has an RSS feed. If it doesn’t, chances are it is a commercial site. However, a site that does have an RSS feed could still be a spam site:
a) It could be a spam blog (a.k.a. a splog)
b) A spammer could have comment-spammed a legitimate blog, which then inflates the PageRank of the spammer's target site.
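One possible way to check for an RSS feed is to look for a feed autodiscovery link in the page's HTML. The sketch below is an assumption about how that check might be done, not part of the original technique: the names `FeedLinkFinder` and `find_feeds` are hypothetical, the page HTML is assumed to have been downloaded already, and Atom feeds are counted alongside RSS.

```python
from html.parser import HTMLParser

# Feed MIME types commonly used in autodiscovery <link> tags.
# Counting Atom as well as RSS is an assumption of this sketch.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link rel>), so normalise to "".
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower()
        if "alternate" in rels and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the list of feed URLs advertised in the given HTML text."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/feed.xml">'
        '</head><body></body></html>'
    )
    print(find_feeds(page))
```

An empty list only means the page does not advertise a feed this way. As noted above, finding a feed does not by itself show that the site is legitimate.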