If you have a website, or run websites for clients, chances are you have installed Google Analytics. There’s also a pretty good chance you see some weird things come up, like websites you’ve never heard of that somehow are sending you traffic.
This has been an issue for well over a year. In fact, I wrote a post about eliminating referral spam back in January of 2016. Unfortunately, these damn spammers are getting more creative!
While much of what I posted back then is still relevant, it is no longer complete.
In this post you will learn:
- My step by step method of killing referral spam
- How to future-proof your Analytics
- Dirty tricks these spammers will use
- What methods no longer work
Getting Started
When I was doing my monthly income report, I was pretty excited at first. I saw that I was getting traffic from Mozilla.org, reddit, and LifeHacker. I was wondering how this could be happening, especially at a time when content production has been almost non-existent.
Seeing 300 visits from Mozilla, 700 from reddit, and 800 from LifeHacker really got me excited, until I looked a bit further.
All the visits came from a “Safari” browser… Now, I’m not an expert in statistics, but I believe that would be considered “statistically improbable” at the very least.
I wish I had that traffic (and links!), but I’m interested in accuracy, even if it makes my numbers lower, because it’s important to get a clear picture of what’s going on.
Don’t Do This…
As I mentioned in my earlier post, there are a few suggestions still floating around to this date that do not work for cleaning up your analytics.
These non-working suggestions include:
#1 – Block the sites in your htaccess file.
An htaccess rule only runs when a request actually reaches your server. Your web server reads the htaccess file when someone tries to load a page, and only then can it block a specific IP or referrer.
The reason this doesn’t work is that most of the referral spam out there never visits your website.
It uses a bot, or some other mechanism, to spam Google Analytics IDs. Not your website, just the IDs.
For example: UA-4572846-1 could be an ID. It wouldn’t be hard to write a bot that goes through the numbers from 100,000 to 1,000,000,000 to find every possible ID to spam with its link. The bot will then simulate a visitor, probably using some internal script, and blast all the GA IDs.
When it blasts the IDs, a referral gets left in your analytics. Since the request never touches your server, blocking it via htaccess is useless.
#2 – Use the referral exclusion feature in .js tracking
Here are the steps people would typically take:
Click on “Admin” inside Google Analytics
Click on the .js Tracking Info option
Click on Referral Exclusion List
Add the referral spam URL
Eventually, these people would see the referral spam go away from their reports. Unfortunately, it isn’t really gone.
The referral exclusion option prevents those referrers from showing up under referrals, but their sessions get counted as part of your “direct traffic” category instead, which still skews your overall metrics.
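To see why the numbers stay skewed, here is a toy model of the behavior described above. It is only a sketch: the session rows and the `apply_referral_exclusion` helper are made up for illustration and are not part of Google Analytics.

```python
from collections import Counter

# Hypothetical sessions as (channel, source) pairs; the names are made up.
sessions = [
    ("referral", "spam-site.xyz"),
    ("referral", "spam-site.xyz"),
    ("referral", "reddit.com"),
    ("direct", "(direct)"),
]


def apply_referral_exclusion(rows, excluded):
    """Relabel sessions from excluded referrers as direct traffic (toy model)."""
    relabeled = []
    for channel, source in rows:
        if channel == "referral" and source in excluded:
            # The session is not dropped, it just changes category.
            relabeled.append(("direct", "(direct)"))
        else:
            relabeled.append((channel, source))
    return relabeled


after = apply_referral_exclusion(sessions, {"spam-site.xyz"})
before_counts = Counter(channel for channel, _ in sessions)
after_counts = Counter(channel for channel, _ in after)

print("total sessions before:", len(sessions), "after:", len(after))
print("before:", dict(before_counts))
print("after:", dict(after_counts))
```

The total session count is the same before and after. The spam has only moved from the referral bucket to the direct bucket.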
Neither one of those options is useful in dealing with referral spam.
Even if it did work, you wouldn’t want to eliminate sites like reddit from showing in your analytics, right?
THE ONLY OPTION THAT WORKS
This process is borrowed from my previous post, with a few adjustments. Previously, all you had to do was add a segment and enter the problem domains.
Since spammers are getting better, and using popular sites for some reason, we need to be able to allow those sites to be displayed, so there are a few additional steps we have to take here.
So here’s what you do:
Step 1: Add a New Segment
This can be done on almost any page. I’m always looking at the Acquisition section but you can add segments on nearly any page in analytics.
Step 2: Tinker With Some Settings
Here’s a screen shot so you can copy the settings I use:
It’s very important you use all these options. Source, matches regex, exclude, sessions.
I leave the demographics, technology and everything else on the side menu blank. The only thing I care about with this filter is the conditions, so I can block the referral spam from showing in reports.
Important Note: Adding Referral Spam To Your Filter
When adding domains, you need to put a \ before each dot (for example, before the .com), because an unescaped dot in a regex matches any character. So to block domain.xyz from your reports, you would add domain\.xyz. Separate multiple domains with a |, like this: domain\.xyz|domain\.abc|abc\.com
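If your list of spam domains gets long, escaping every dot by hand gets tedious. Here is a minimal Python sketch that builds the pattern for you. The `build_source_regex` name is my own, and it assumes the domains contain only letters, digits and dots, so the result uses nothing beyond `\.` and `|`.

```python
import re


def build_source_regex(domains):
    """Escape each domain and join them with | for a 'matches regex' condition."""
    return "|".join(re.escape(domain) for domain in domains)


pattern = build_source_regex(["domain.xyz", "domain.abc", "abc.com"])
print(pattern)  # domain\.xyz|domain\.abc|abc\.com

# Why the backslash matters: an unescaped dot matches any character.
unescaped_hits = re.search("abc.com", "abcxcom") is not None
escaped_hits = re.search(r"abc\.com", "abcxcom") is not None
print(unescaped_hits, escaped_hits)  # True False
```

You can then paste the printed pattern straight into the Source condition of the segment.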
Step 3 (NEW): Add Another Condition for HostName
This wasn’t always necessary, but it is a good way to exclude a lot of new sites that pop up as spam. These spammers won’t have your domain set as the hostname. So all you do is click the add button on the first filter.
Now you need to go through the drop-down, select Behavior, and select Hostname. There, you can add your domain and you will eliminate the majority of referral spam, and even non-legit search/social spam that doesn’t use the correct hostname.
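If you export your data, you can apply the same hostname check outside of Analytics. This is only a sketch under a few assumptions: the row layout is hypothetical, and `example.com` is a stand-in for your own domain.

```python
# Hypothetical stand-in for your own domain(s); replace with your real hostnames.
MY_HOSTNAMES = {"example.com", "www.example.com"}

# Hypothetical exported rows; the field names are made up for this sketch.
rows = [
    {"source": "reddit.com", "hostname": "www.example.com", "sessions": 12},
    {"source": "reddit.com", "hostname": "(not set)", "sessions": 700},
    {"source": "lifehacker.com", "hostname": "spam-host.xyz", "sessions": 800},
]


def has_valid_hostname(row, valid=MY_HOSTNAMES):
    """Return True when the hit was recorded against one of your own hostnames."""
    return row["hostname"].lower() in valid


kept = [row for row in rows if has_valid_hostname(row)]
real_sessions = sum(row["sessions"] for row in kept)
print(real_sessions)  # 12
```

Notice the real reddit visits survive, because the check looks at the hostname and not at the source.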
Using the settings above will help with a majority of the issues you come across.
However, there’s one more step.
Step 4 (NEW): Add ANOTHER filter for Network Domain
This is a trick you will have to use for a while when dealing with ghost spam from what looks to be legitimate sites. For example, if you’re getting traffic from facebook, but it looks fake, you can easily verify what traffic is real and what isn’t.
If you’re in “Acquisition” in Google Analytics, you can simply select Secondary dimension, click Users, and click Network Domain to see who is coming from a legitimate service provider.
Most of the referral spam I see comes from Russia. So if you see a .ru domain, or a foreign ISP you wouldn’t expect visitors from, chances are it is spam.
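The same check can be done on exported data by totaling sessions per network domain for one suspicious source. This is a rough sketch: the rows, the field names and the ISP names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical exported rows; ISP names and field names are made up.
rows = [
    {"source": "facebook.com", "network_domain": "hypothetical-isp.ru", "sessions": 300},
    {"source": "facebook.com", "network_domain": "home-isp.example", "sessions": 9},
    {"source": "facebook.com", "network_domain": "hypothetical-isp.ru", "sessions": 150},
    {"source": "reddit.com", "network_domain": "home-isp.example", "sessions": 4},
]


def sessions_by_network_domain(rows, source):
    """Total sessions per network domain for one source, largest first."""
    totals = defaultdict(int)
    for row in rows:
        if row["source"] == source:
            totals[row["network_domain"]] += row["sessions"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


breakdown = sessions_by_network_domain(rows, "facebook.com")
for network_domain, total in breakdown:
    print(network_domain, total)
```

If one unfamiliar network domain accounts for nearly all of a source’s sessions, that is the one to look at first.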
Here’s how you deal with that:
To add the network domain condition, this time click on Users instead of Behavior and find the Network Domain option, just like you would on your main Acquisition page when looking for secondary dimensions.
This option allows you to filter out all future spammers that don’t spoof their network domain.
Step 5: Name, Save, and View
So this is the easy part. Name the segment whatever you want; I have it as All Sessions -referral spam. Click save, and then view the segment. I always keep that segment on so I get true metrics each time I look.
Final Thoughts
This type of spam is relatively new. It has only popped up in the last month or so for me. The generic source filter still works for most but there are some clever spammers that have worked around it.
Hopefully the solutions given in here will last another year before I have to edit again. However, I will keep this post as the “master” guide on removing referral spam and will continue to update it as needed.
If you have any clever ways to combat referral spam, please share in the comment section!
First of all, thanks for taking the time to write all of these value rich articles.
Secondly, I noticed that the website I’m doing SEO for doesn’t have many backlinks yet, but when I checked Majestic, I saw links that I didn’t put there and don’t know the source of.
I’m going to use your tips to keep the spammers at bay.
You now have another follower.
Hey Cary, thanks for stopping by, I’m glad to have you as a new reader!
Do you have Webmaster tools for the site? I’m curious if they’re reporting in there. Maybe the domain expired at some point before you or the current owner registered it.
I was getting a lot recently and it was throwing off my analytics. After reading this post I searched for an alternative and came across a premade list by loganix. You just click the link and it automatically adds a list to analytics.
It seems to be working well. Although one or two are getting through until an update (which is monthly). I’d highly recommend you look it up.
Cheers Income Bully
The problem eventually becomes that the spammers are using other methods instead of just domains. For example, you’ll have to go deeper into it, and remove certain host names or even referring countries altogether.