How to make a stab at blocking referrer spam in Google Analytics

Referrer spam is a newer type of spam affecting those who use Google Analytics to track site statistics. Very generally speaking, it’s traffic from bots that appears in your analytics figures with fake referrer URLs pointing to spam-related sites. The idea is to get you to visit the spammer’s site out of curiosity. Some types of referrer spam don’t even visit the site at all; they just send fake hits based on Google Analytics IDs.

The problem with referrer spam is that it throws off site analytics figures, especially metrics like bounce rate. That’s not much of a problem for a huge site like CNN or Huffington Post, but for smaller sites (like mine), it can greatly skew things. I don’t like checking in to see that a few spammers have caused 50% of the day’s hits to be fake visits.

As for how to combat it, there are various suggestions offered online, with varying success, especially given that spammers are always changing their techniques. Thus, while it’s not guaranteed, I thought I’d list what I’ve done (based on various sites’ advice) that’s had the most success so far.

1. Create a new view in Google Analytics

Before doing anything else, it’s best to create a new view in Analytics for your filters. In case something goes wrong, you’ll still have the original, unfiltered data to fall back on. To do that, go to the “Admin” menu in the top menu bar. Under the “View” column, click on the dropdown menu, select “Create new view,” and give it a name, such as “Filtered view.” (If needed, also create a view that’ll just be for default, unfiltered data, but leave it alone.) Once created, use the new filtered view for setting up the filters below.

2. Set up a filter for when page title isn’t set/“(not set)”

Being spammers, some of them don’t bother filling in the name of the page visited, especially for “ghost” referral spam that avoids visiting a site (and risking getting blocked more easily). This appears in Analytics as “(not set)” (basically, null).

To filter this out, go to the “View” column (where you created the above new view), and select “Filters.” Click the “New Filter” button, and fill in the filter as below (you can name it whatever you want):

[Screenshot: Google Analytics filter settings, “not set” filter]

Save the filter.
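
To picture what this filter does to your numbers, here’s a rough Python sketch. It isn’t how Analytics works internally; the hit records, field names, and the `has_page_title` helper are all made up for illustration, and it assumes a missing page title shows up either as an empty string or as the “(not set)” placeholder.

```python
# Hypothetical hit records; real data would come from an Analytics export.
hits = [
    {"page_title": "Home", "referrer": "google.com"},
    {"page_title": "(not set)", "referrer": "spam-example.test"},
    {"page_title": "", "referrer": "spam-example.test"},
    {"page_title": "About", "referrer": "(direct)"},
]


def has_page_title(hit: dict) -> bool:
    """Return True if the hit carries a real page title (assumed field name)."""
    title = (hit.get("page_title") or "").strip()
    return title not in ("", "(not set)")


# Keep only hits with a page title, then see how much traffic was dropped.
kept = [h for h in hits if has_page_title(h)]
excluded_share = 1 - len(kept) / len(hits)
print(f"Kept {len(kept)} of {len(hits)} hits; excluded {excluded_share:.0%}")
```

Running something like this on a day’s exported data gives a quick idea of how much of the traffic the filter would remove before you rely on it.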

3. Filter out hostnames besides your own

This filter is designed to make sure traffic is actually coming from your own domain(s). Back under the “Filters” setting above, create another new filter, and fill it in as follows:

[Screenshot: Google Analytics filter settings, hostname filter]

You’ll need to be careful here. In “Filter Pattern,” make sure to enter every domain you use, as well as any third-party sites that might legitimately show up as hostnames in your data (PayPal, Amazon, etc.). Check your Analytics history for legitimate hostnames (under Audience > Technology > Network, select “hostname” from “primary dimension” in the results). Enter the desired domains similar to the following:

.*example\.com|.*googleusercontent\.com

“Googleusercontent.com” accounts for legitimate Google services, particularly its translation features. In my case, some of my traffic’s clearly from non-English speakers using Google Translate, so I don’t want to filter those users out.
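
Before saving, it can help to sanity-check the pattern against hostnames from your Analytics history. Below is a minimal Python sketch, assuming Python’s `re` module behaves like the Analytics filter field for a simple pattern like this one, and that the pattern may match anywhere in the hostname. The `example.com` domain, the sample hostnames, and the `is_valid_hostname` helper are placeholders, not anything from Analytics itself.

```python
import re

# Hypothetical pattern: swap example.com for your own domain(s).
# The dots are escaped so they match a literal "." rather than any character.
HOSTNAME_PATTERN = re.compile(r".*example\.com|.*googleusercontent\.com")


def is_valid_hostname(hostname: str) -> bool:
    """Return True if an include filter using the pattern would keep this hostname."""
    # search() reflects the assumption that the pattern can match anywhere.
    return HOSTNAME_PATTERN.search(hostname) is not None


# Made-up sample hostnames, standing in for ones found in your reports.
samples = [
    "www.example.com",
    "translate.googleusercontent.com",
    "spam-site.xyz",
    "(not set)",
]

for host in samples:
    status = "keep" if is_valid_hostname(host) else "drop"
    print(f"{host}: {status}")
```

If a hostname you recognize as legitimate comes back as “drop,” add its domain to the pattern before saving the filter.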

Enter all the above, and save the filter.

4. WordFence’s Advanced Blocking feature

One of the security plugins I use is WordFence, which does a pretty good job of boosting site security all around. One of its features, “Advanced Blocking,” lets you block visitors to your site on a range of criteria, including IP address, the browser’s user agent, or the referrer. While it isn’t perfect (since spammers can change the referrer constantly), it can help block some of the more persistent unwanted site visitors. Blocking on broader terms unlikely to appear in a legitimate visitor’s referrer (like “SEO”) might also help.
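
To show the idea behind blocking on a broad referrer term, here’s a small Python sketch. It is not WordFence’s code or configuration format; the wildcard patterns, the `spam-example.test` domain, and the `is_blocked_referrer` helper are assumptions made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical block list: "*" stands for any run of characters.
BLOCKED_REFERRER_PATTERNS = ["*seo*", "*spam-example.test*"]


def is_blocked_referrer(referrer: str) -> bool:
    """Return True if the referrer matches any blocked wildcard pattern."""
    # Lowercase first so the match ignores case ("SEO" vs. "seo").
    referrer = referrer.lower()
    return any(fnmatchcase(referrer, p) for p in BLOCKED_REFERRER_PATTERNS)


print(is_blocked_referrer("http://best-SEO-offers.example/"))
print(is_blocked_referrer("https://www.google.com/search?q=comics"))
```

One catch with broad terms: a pattern like `*seo*` would also match a legitimate referrer containing “seoul,” so it’s worth checking your existing referrers before blocking on a short keyword.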

Conclusion

Again, I can’t guarantee the above will be perfect (or work indefinitely), but it’s what I’ve been using lately. Of course, Google really should do something (if possible) about their Analytics service being so easily manipulated by spammers.
