
Understanding GA4 Bot Traffic and Blocking Referral Domains

Business and marketing decisions are only as good as the data that informs them, so keeping your Google Analytics 4 (GA4) data clean and accurate is critical for online companies. Unfortunately, bot traffic, which now accounts for roughly half of all internet traffic, is a constant source of misleading data.

While many bots, like Googlebot, are beneficial, “bad bots” engage in scraping, content theft and attacks. Because they don’t reflect human user behaviour, their visits severely skew key performance indicators.

Identifying and Excluding Bots in GA4

So how can you spot this bot activity? Firstly, look for anomalous behaviour in your reports and identify suspicious patterns in your data, such as the following (a scripted example that surfaces several of these signals appears after the lists):

Behavioural data

  • Short engagement sessions

Bots often spend only seconds on a page, unlike human visitors. Check the engagement rate (GA4’s replacement for Universal Analytics’ bounce rate, and effectively its inverse) and average session duration for anomalies.

  • Unrealistic page views

Hundreds of pages viewed in a single session could indicate scraping bots.

  • Unusual interactions

Unnatural scroll patterns, rapid form fills or multiple page visits with no mouse movement are typical bot-specific interactions.

  • A sudden influx of spam comments

Spam bots’ main purpose is to land on your website and spread spammy comments. These comments often promote unrelated products, use flattering language or include irrelevant links, and you can recognise them by their unnatural, generic wording.

  • Declined card transactions

Fraudsters usually use stolen or non-existent credit card details. Repeated attempts to complete a transaction suggest they are cycling through card variations or that the stolen card has already been blocked.

Demographic data

  • Traffic spikes from a single IP address

Botnets or malicious scripts can generate high traffic volumes in a short time span. Sudden, massive increases in web traffic originating from a single IP address are a clear sign of bot traffic.

  • Unlikely origins

Traffic spikes from countries you don’t target, or visits classified as ‘Location not set’, might be bots. Unfamiliar referring websites, data centres and crawlers with odd user agents are another sign of suspicious traffic.

  • Unusual devices or traffic sources

A sudden influx of traffic from uncommon devices or operating systems can indicate bot activity, as can a spike in new-user acquisition from a single traffic source.
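
As a practical illustration, several of the behavioural and demographic signals above can be pulled programmatically and scanned for anomalies. Below is a minimal sketch in Python using the GA4 Data API (the google-analytics-data client library); the property ID and the thresholds used to flag rows are placeholder assumptions, not values from this article.

```python
# Minimal sketch: flag bot-like traffic with the GA4 Data API.
# Assumes the google-analytics-data library is installed and that
# Application Default Credentials with access to the property are set up.
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Metric, RunReportRequest,
)

PROPERTY_ID = "123456789"  # placeholder GA4 property ID

client = BetaAnalyticsDataClient()
request = RunReportRequest(
    property=f"properties/{PROPERTY_ID}",
    dimensions=[
        Dimension(name="sessionSource"),
        Dimension(name="country"),
        Dimension(name="deviceCategory"),
    ],
    metrics=[
        Metric(name="sessions"),
        Metric(name="engagementRate"),
        Metric(name="averageSessionDuration"),
        Metric(name="screenPageViewsPerSession"),
    ],
    date_ranges=[DateRange(start_date="7daysAgo", end_date="today")],
)
response = client.run_report(request)

# Flag rows that look bot-like: plenty of sessions but almost no engagement,
# or an unrealistic number of pages per session.
for row in response.rows:
    source, country, device = (d.value for d in row.dimension_values)
    sessions = int(row.metric_values[0].value)
    engagement_rate = float(row.metric_values[1].value)   # fraction, 0–1
    avg_duration = float(row.metric_values[2].value)      # seconds
    pages_per_session = float(row.metric_values[3].value)
    if sessions > 100 and (
        engagement_rate < 0.1 or avg_duration < 5 or pages_per_session > 50
    ):
        print(f"Suspicious: {source} / {country} / {device} "
              f"({sessions} sessions, {engagement_rate:.0%} engaged)")
```

Running something like this daily and comparing the output against your usual baselines makes sudden spikes from a single country, device type or traffic source much easier to spot than scanning the standard reports by eye.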

The Problem with Simple Filtering

GA4 automatically excludes known bots and spiders by default, using a combination of Google research and the IAB’s International Spiders and Bots List.

However, this feature cannot be disabled or adjusted, and you won’t be able to see how much bot traffic was excluded. And because this method only excludes bots on the pre-compiled industry list, it leaves out the countless sophisticated, unknown and ever-evolving malicious bots that are skilled at mimicking human behaviour.

It’s important to understand this fundamental distinction: removing bots from Google Analytics reports does not remove them from your website.

While your data might look cleaner, the bots are still actively hitting the website, slowing down performance, consuming server resources and potentially carrying out serious attacks like Distributed Denial of Service (DDoS) or Account Takeover (ATO).

Cleaning your analytics data solves a symptom – inaccurate reporting – but to protect your business and customers, you must address the root problem by deploying a dedicated bot management solution that blocks malicious traffic *before* it ever reaches your website, app, or API. True data accuracy is a by-product of real security.

Manual Filtering for Specific Traffic

Manual filtering is often time-consuming and unreliable for high-traffic websites, but if you want to go further, there are several ways to remove bot traffic that GA4’s default exclusion misses:

  • Create a Segment or Comparison group

In the Explore tab, create a segment to exclude traffic based on criteria like “User Agent does not contain” or “IP Address does not equal”. You can also exclude low-engagement traffic by adding conditions based on events or session length.

You can set up a similar group in Comparisons and apply it over the affected time period to exclude the bot activity based on the criteria you’ve identified (a scripted equivalent using the GA4 Data API is sketched after this list).

  • Use the Referral Exclusion List

Go to Admin > Data Streams > [Your Stream] > Configure tag settings > Show all > List unwanted referrals to add known spam domains that you want to exclude from your data. Of course, this only works from the date it is set up, and these spam domains change often, so it can be an ongoing task!
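
Returning to the segment idea above: if you report on GA4 data outside the interface, a comparable exclusion can be reproduced with the Data API’s dimension filters. This is only a minimal sketch; the property ID and the excluded referral source are placeholder assumptions.

```python
# Minimal sketch: a GA4 Data API report that excludes a known spammy
# traffic source, roughly mirroring an "exclude" segment in Explore.
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Filter, FilterExpression, Metric, RunReportRequest,
)

PROPERTY_ID = "123456789"            # placeholder GA4 property ID
SPAM_SOURCE = "spam-domain.example"  # placeholder referral source to exclude

client = BetaAnalyticsDataClient()
request = RunReportRequest(
    property=f"properties/{PROPERTY_ID}",
    dimensions=[Dimension(name="sessionSource")],
    metrics=[Metric(name="sessions"), Metric(name="totalUsers")],
    date_ranges=[DateRange(start_date="28daysAgo", end_date="today")],
    # Keep only rows whose session source does NOT contain the spammy domain.
    dimension_filter=FilterExpression(
        not_expression=FilterExpression(
            filter=Filter(
                field_name="sessionSource",
                string_filter=Filter.StringFilter(
                    value=SPAM_SOURCE,
                    match_type=Filter.StringFilter.MatchType.CONTAINS,
                ),
            )
        )
    ),
)
response = client.run_report(request)
for row in response.rows:
    print(row.dimension_values[0].value, row.metric_values[0].value)
```

Like the Referral Exclusion List itself, this only changes what you see in reports; the underlying hits are still recorded.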

Other methods

  • Block at the server level

Identify and block suspicious IP addresses directly on your web server or firewall (a minimal application-level sketch appears at the end of this list).

  • Implement reCAPTCHA

Use reCAPTCHA on your forms to prevent automated submissions, and verify the token on the server side as sketched below.
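
To follow the reCAPTCHA suggestion through, the token that the widget adds to a form still needs to be checked on your server before the submission is accepted. Below is a minimal Python sketch using the requests library and Google’s siteverify endpoint; the secret key and the 0.5 score threshold are placeholder assumptions.

```python
# Minimal sketch: verify a reCAPTCHA token server-side before accepting
# a form submission. RECAPTCHA_SECRET is a placeholder for your secret key.
import requests

RECAPTCHA_SECRET = "your-secret-key"  # placeholder

def recaptcha_passed(token: str, client_ip: str | None = None) -> bool:
    """Return True if Google confirms the token came from a real user."""
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data={
            "secret": RECAPTCHA_SECRET,
            "response": token,      # the g-recaptcha-response value from the form
            "remoteip": client_ip,  # optional, helps Google's risk analysis
        },
        timeout=5,
    )
    result = resp.json()
    # reCAPTCHA v3 also returns a score: 0.0 looks like a bot, 1.0 like a human.
    return result.get("success", False) and result.get("score", 1.0) >= 0.5
```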
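
And picking up the server-level suggestion above: IP blocking is usually done in the web server, firewall or CDN, but the same idea can be sketched at the application layer. The Flask example below is an illustration only; the blocked addresses are documentation-range placeholders, and a maintained blocklist or WAF would do this job far better in production.

```python
# Minimal sketch: reject requests from known-bad IP addresses before they
# reach any route handler. BLOCKED_IPS is a placeholder list you would
# maintain (or replace with a firewall/WAF rule) yourself.
from flask import Flask, abort, request

app = Flask(__name__)

BLOCKED_IPS = {"203.0.113.10", "198.51.100.24"}  # placeholder example addresses

@app.before_request
def block_suspicious_ips():
    # access_route accounts for X-Forwarded-For when the app sits behind a proxy.
    client_ip = request.access_route[0] if request.access_route else request.remote_addr
    if client_ip in BLOCKED_IPS:
        abort(403)  # refuse the request outright

@app.route("/")
def home():
    return "Hello, human visitor!"
```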


If you want to know more about how identifying and excluding bots in GA4, together with excluding suspicious domains, could help to improve your business and marketing decisions, please get in touch.