How to Identify and Remove Bot Traffic in Google Analytics
If you use Google Analytics regularly, you have probably come across bot traffic at some point. It usually shows up as a spike in traffic with no obvious explanation, often made up of hits with unusual characteristics that are very similar to one another.
For example, a surge in bot traffic might appear as a large number of hits from a specific location in an unusual part of the world, or it might have a large number of ‘not set’ attributes.
In general, if you receive a lot of hits in a short space of time and you don’t know why, a bot may have visited you. There are both legitimate and malicious reasons why you might see bot traffic in Google Analytics data, and there are ways to tackle it too.
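As a rough illustration of spotting an unexplained spike, the sketch below flags any day whose session count is far above the typical level. It assumes you have exported daily session counts from a report yourself; the figures, dates, function name and the threshold of three times the median are all hypothetical choices for illustration, not anything Google Analytics provides.

```python
from statistics import median

def find_spikes(daily_sessions, factor=3.0):
    """Return (day, sessions) pairs whose count exceeds factor x the median.

    daily_sessions: list of (day_label, session_count) tuples.
    factor: hypothetical threshold; tune it to your own traffic.
    """
    baseline = median(count for _, count in daily_sessions)
    return [(day, count) for day, count in daily_sessions
            if count > factor * baseline]

# Hypothetical daily session counts exported from a report
sample = [
    ("2024-03-01", 120), ("2024-03-02", 131), ("2024-03-03", 118),
    ("2024-03-04", 940), ("2024-03-05", 125), ("2024-03-06", 129),
]

for day, count in find_spikes(sample):
    print(f"Possible bot spike on {day}: {count} sessions")
```

A flagged day is only a prompt to investigate. A legitimate campaign or press mention can produce the same pattern, so check the location, referrer and ‘not set’ attributes before treating it as bot traffic.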
The good guys and the bad bots
Bots are a fact of life on the internet. They are used for all kinds of automated tasks. ‘Bot’ is an abbreviated form of ‘robot’ and even the search engine robots that crawl and index your pages are a kind of bot – and you wouldn’t want to block them from accessing your site.
However, there are also malicious bots. These can be used for nefarious purposes, for example:
- Denial of service attacks, which attempt to overload your server with a vast number of simultaneous connections.
- Analytics data spamming, which floods your data with spammy information such as a specific referrer URL.
- Website scraping, in which the bot crawls each page in turn and duplicates your website content as a means of copyright theft.
Preventing bots from gaining access to your website is one issue, but if you occasionally receive visits from bots, removing that data from your Google Analytics reports is another important task. Doing this will ensure any web marketing and SEO campaigns are based on accurate data.
Exclude bot traffic in Google Analytics
Google Analytics has a built-in setting that excludes hits from known bots. It’s worth knowing where to find it, so you can check whether it’s activated, or deactivate it for testing purposes.
First of all, visit your Google Analytics Settings page. One way to get there is by clicking the Settings cog at the bottom-left of your Google Analytics dashboard.
Choose the relevant Account, Property and View from the dropdown lists. Then, under the View column towards the right of your screen, click on View Settings.
Scroll down if necessary. Towards the bottom of the View Settings page, you will see ‘Bot Filtering: Exclude all hits from known bots and spiders’. If this box is ticked, Analytics is already removing hits from known bots and spiders from your reports.
Remember this only filters out bot sources that are known to Google. Also be aware that if you are working to secure your server against bot traffic, you might want to untick this box so you can see all successful bot visits to your website.
Include AND exclude bot traffic in Google Analytics
If you’d like to have access to your Analytics data both with and without bot traffic, you can create another View. Keep the Bot Filtering box ticked on one View at all times, and always leave it unticked on the other.
Doing this is especially useful if you are working to filter out bot traffic, as you can always refer to the unfiltered View to see how much traffic you have or haven’t filtered from your data.
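To put a number on the difference between the two Views, you can compare their session totals for the same date range. The sketch below is a minimal example with hypothetical figures and a hypothetical function name; it assumes you have read both totals from your reports yourself.

```python
def filtered_share(unfiltered_sessions, filtered_sessions):
    """Return the fraction of sessions removed by filtering (0.0 to 1.0)."""
    if unfiltered_sessions <= 0:
        raise ValueError("unfiltered_sessions must be positive")
    removed = unfiltered_sessions - filtered_sessions
    return removed / unfiltered_sessions

# Hypothetical totals for the same date range in each View
share = filtered_share(unfiltered_sessions=10_000, filtered_sessions=9_200)
print(f"{share:.1%} of sessions were filtered out")
```

If the share suddenly jumps, either a new bot has appeared or a filter has started catching legitimate traffic, and the unfiltered View lets you check which.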
Filter bot data from Google Analytics
To manually remove bot data from Google Analytics, you can set up custom filters. Filtered hits are permanently excluded from that View from the moment the filter is applied and cannot be recovered later, so it’s worth keeping an unfiltered View at all times and creating a secondary View to use with filters.
Again, click the cog at the bottom-left to go to Google Analytics Settings, then under the right-hand View column, click Filters.
Click the Add Filter button and define the various parameters you want to use to exclude bot data from Google Analytics reports for that View.
If you have Account-level permissions, you can also click the All Filters option under the left-hand Account menu, from where you can define new filters and apply them to multiple Views under the same Account.
There’s a huge number of ways to define filters in Google Analytics, so try to find a common characteristic among your bot traffic that is not present in your legitimate website traffic.
Some examples include very short visits to only one page, traffic from an unusual location (such as Kazakhstan) or multiple hits from a spammy referrer URL.
Be as specific as possible – that way there’s less chance of legitimate traffic being caught as a false positive by your would-be bot filter.
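Before applying a pattern-based filter to a View, it can help to test the pattern against known good and known bad values. The sketch below does this for a referrer pattern using Python’s re module. The spam domains are made up for illustration, and Python’s regular expressions are not guaranteed to behave identically to the pattern matching in Google Analytics, so treat it as a sanity check rather than a guarantee.

```python
import re

# Hypothetical spam referrer domains. The (^|\.) prefix means the domain
# must match whole, or as a subdomain, not as part of a longer name.
SPAM_REFERRER_PATTERN = re.compile(
    r"(^|\.)(free-traffic\.example|seo-offers\.example)$",
    re.IGNORECASE,
)

def is_spam_referrer(referrer_host):
    """Return True if the referrer hostname matches the spam pattern."""
    return SPAM_REFERRER_PATTERN.search(referrer_host) is not None

# Hypothetical referrers: the last legitimate one only resembles a spam domain
legitimate = ["google.com", "news.example.org", "notfree-traffic.example"]
spam = ["free-traffic.example", "www.seo-offers.example"]

for host in legitimate + spam:
    label = "exclude" if is_spam_referrer(host) else "keep"
    print(f"{host}: {label}")
```

Anchoring the pattern to the start and end of the hostname is what keeps a lookalike such as notfree-traffic.example out of the filter. A looser ‘contains’ match would have excluded it.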
Hide bot traffic in Google Analytics
A final option, if you’re working with past data that already contains bot traffic, is to hide that traffic temporarily in your Google Analytics reports.
Go to a specific report page in your Analytics dashboard, for example:
- Audience > Geo > Location
- Acquisition > All Traffic > Referrals
- Behaviour > Site Content > All Pages
Look for the table of data towards the bottom of the page, and the search box (with the word ‘advanced’ alongside it) at the top-right of the table.
You can type a word into the search box to narrow the visible data in real time to only the rows that match the search term. For more sophisticated filtering of possible bot traffic, click ‘advanced’.
From there you can choose various Google Analytics dimensions or metrics and filter your data by exact match, the beginning or end of the data (useful for URL stubs, for example), data that contains the search term anywhere, or more advanced RegExp matches.
Importantly, this method does not delete any data – it just hides it temporarily. As such, you can use it during testing and remove the search filter to restore the full view of your data.
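The match types offered by the advanced search can be pictured with the short sketch below, which applies each one to a list of page paths. The function name, mode labels and sample paths are hypothetical, and the sketch is case-sensitive, which may not match how Google Analytics treats your search terms.

```python
import re

def filter_rows(rows, term, mode):
    """Return the rows that match term under the given match mode."""
    checks = {
        "exact": lambda value: value == term,
        "begins_with": lambda value: value.startswith(term),
        "ends_with": lambda value: value.endswith(term),
        "contains": lambda value: term in value,
        "regex": lambda value: re.search(term, value) is not None,
    }
    if mode not in checks:
        raise ValueError(f"unknown match mode: {mode}")
    return [row for row in rows if checks[mode](row)]

# Hypothetical page paths from the All Pages report
pages = ["/blog/bot-traffic", "/blog/seo-tips", "/contact", "/blog/"]

print(filter_rows(pages, "/blog/", "begins_with"))
print(filter_rows(pages, "/", "ends_with"))
print(filter_rows(pages, r"^/blog/.+", "regex"))
```

As with the search box itself, nothing here changes the underlying list. Each call simply returns the subset of rows that match.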
Whichever method you use, don’t forget to keep at least one View that either does not filter bot traffic at all or uses only the built-in Google Analytics Bot Filtering setting.
That way, no matter what happens now or in the future, you should always have access to your complete website data. Doing this will allow you to test new filters, identify new bots or experiment with new website security settings to block out the bad bots.