Self-Referrals in Google Analytics
A self-referral in Google Analytics is a session where the source is your own site. For example, on Distilled.net in September 2015, we had a bunch of sessions show up in Google Analytics like this:
This is often ignored or considered innocuous, but it indicates that something has gone wrong with the sessions involved. It doesn’t make sense for a user to have arrived on your site via your site: there must have been an original source, and every subsequent hit after that original landing should belong to the same session, provided the user doesn’t leave and come back via some other channel, or go inactive for a long period of time (neither of which would appear as self-referrals).
Back in June, I wrote a post discussing the various ways in which Google Analytics can split sessions, with highly misleading results, and self-referrals are often a symptom of this. For example, take the two sessions below:
The second session on the right might well describe its source as a self-referral (e.g. arriving on distilled.net via distilled.net), and this is unhelpful because we can no longer attribute its conversion to Google Search, which is how the user actually found the site.
Identifying self-referrals
In the screenshot above, I’ve simply gone to the “Referrals” report in Google Analytics and filtered for my own domain name – in this case, distilled.net. For the purposes of this example, I’ve also filtered down to a time period in September 2015 when I know Distilled had some self-referral issues.
It’s important to take a look at the metrics that are displayed here. In our case, the number of sessions was not hugely significant in the context of the site as a whole, and there were no conversions attributed to the self-referrals, which means that no conversion data was mis-attributed. If the number of sessions were large, or if any conversions were included, this would be a high-priority problem. Ultimately, this comes down to your organisation’s attitude to data: I think for most organisations, any conversion going awry is a cause for concern, but the tolerance for inaccuracy in session counts on their own may vary more.
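If you prefer to run this check on exported data rather than in the interface, the same filter and the same priority judgement can be sketched in a few lines of Python. This is only an illustration: the row format (`source`, `sessions`, `conversions`), the function name, and the numbers are all made up for the example, and `example.com` stands in for your own domain.

```python
def summarise_self_referrals(rows, domain, total_sessions):
    """Summarise referral rows whose source contains our own domain.

    `rows` is a hypothetical export of the Referrals report: a list of
    dicts with "source", "sessions" and "conversions" keys.
    """
    own = [r for r in rows if domain in r["source"].lower()]
    sessions = sum(r["sessions"] for r in own)
    conversions = sum(r["conversions"] for r in own)
    # Share of all sessions on the site that were self-referrals.
    share = sessions / total_sessions if total_sessions else 0.0
    return {"sessions": sessions, "conversions": conversions, "share": share}


# Made-up numbers, purely for illustration.
rows = [
    {"source": "example.com", "sessions": 40, "conversions": 0},
    {"source": "blog.example.com", "sessions": 10, "conversions": 0},
    {"source": "othersite.org", "sessions": 100, "conversions": 3},
]
summary = summarise_self_referrals(rows, "example.com", total_sessions=2000)
print(summary)
```

A non-zero `conversions` figure, or a `share` you consider large, is the signal that the problem deserves priority.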
Diagnosis
Self-referrals can occur in one of two ways:
- A session can be split in two when moving around your site – for example, because the cookie was stored on blog.example.com, and you moved to payment.example.com.
- A session started on a page without tracking code, meaning that when a user moved from that page to one with tracking code, the only available referrer was your own site.
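The second case can be illustrated with a small Python sketch. It is a simplified model, not how Google Analytics itself is implemented: it assumes each page's referrer is simply the previous page the user was on, and the function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def first_recorded_referrer(external_referrer, visits):
    """Return the referrer seen by the first page that has tracking code.

    `visits` is a list of (url, has_tracking_code) tuples in visit order.
    Simplifying assumption: each page's referrer is the previous page.
    """
    referrer = external_referrer
    for url, has_tracking_code in visits:
        if has_tracking_code:
            return referrer
        # An untracked page records nothing, but becomes the next referrer.
        referrer = url
    return None  # no tracked page was ever reached


def is_self_referral(referrer, site_domain):
    """True if the referrer's hostname is the site's domain or a subdomain of it."""
    host = urlsplit(referrer).hostname or ""
    return host == site_domain or host.endswith("." + site_domain)


visits = [
    ("https://www.example.com/landing", False),  # tracking code missing
    ("https://www.example.com/pricing", True),
]
seen = first_recorded_referrer("https://www.google.com/search", visits)
print(seen, is_self_referral(seen, "example.com"))
```

In this model the original source (Google Search) is never recorded, because the only page that saw it had no tracking code.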
In either case, the best place to start is to create a segment only containing self-referrals, like so:
To create the above segment:
- Click “Add Segment” at the top of the reporting interface
- Go to “Advanced” > “Conditions” in the segment creation menu
- Filter for sessions including “Source” “contains” [yourdomainname.tld]
- Name the segment, and click “Save”
With that segment active, navigate to the landing page report, and set the secondary dimension to “Full Referrer”:
You can now go through each of the pages in this report, and check for the following:
- The referring page has no tracking code. In this case, the fix should be simple: add tracking code to that page! If the page does have tracking code, you could double-check that it’s firing using inspect element.
- Links between the referring page and the landing page are marked with utm parameters. It’s never a good idea to use these internally, precisely because they cause a loss of attribution data.
- The landing page and the referring page are not on the same domain or subdomain. In this case, one of them probably sits on a different subdomain or domain to the rest of your site, so you can dig into this and set up cross-domain tracking if needed.
- Inconsistent analytics implementation between the two pages. For example, check for renamed trackers, different analytics libraries, or different tracking IDs.
In the example above, we had some pages on the Distilled site that had clung on to an outdated version of Google Analytics. The landing page “(not set)” refers to sessions that were made up purely of an event that had fired using this old tracking code.
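If there are a lot of landing page and referrer pairs to get through, parts of the checklist above can be roughed out in Python. Treat this as a sketch under some assumptions rather than a complete audit: it assumes you already have each page's URL and HTML source to hand, it looks for tracking IDs of the form `UA-12345-1` directly in the HTML (so it will miss tracking loaded through a tag manager and cannot tell whether a tag actually fires), and the `diagnose` and `tracking_ids` names are invented for this example.

```python
import re
from urllib.parse import parse_qs, urlsplit

UTM_KEYS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}
# Assumption: tracking IDs appear literally in the page source as UA-<digits>-<digits>.
TRACKING_ID = re.compile(r"\bUA-\d+-\d+\b")


def tracking_ids(html):
    """Return the set of tracking IDs found in a page's HTML source."""
    return set(TRACKING_ID.findall(html))


def diagnose(referrer_url, landing_url, referrer_html, landing_html):
    """Return a list of likely causes for one referrer/landing page pair."""
    findings = []
    ref_ids = tracking_ids(referrer_html)
    land_ids = tracking_ids(landing_html)

    if not ref_ids:
        findings.append("referring page has no tracking ID in its HTML")

    query = parse_qs(urlsplit(landing_url).query)
    if UTM_KEYS & set(query):
        findings.append("landing URL carries utm parameters")

    if urlsplit(referrer_url).hostname != urlsplit(landing_url).hostname:
        findings.append("pages are on different hostnames")

    if ref_ids and land_ids and ref_ids != land_ids:
        findings.append("pages use different tracking IDs")

    return findings


# Hypothetical pair, for illustration only.
findings = diagnose(
    "https://blog.example.com/post",
    "https://www.example.com/buy?utm_source=blog",
    "<html></html>",
    "<script>ga('create', 'UA-12345-1', 'auto');</script>",
)
for finding in findings:
    print(finding)
```

An empty list does not prove the pair is healthy. It only means none of these particular checks found anything, so renamed trackers or different analytics libraries still need a manual look.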
Discussion
As always, if you have anything to add, whether that be other ways to easily identify self-referrals or quick fixes to stop it happening in the first place, get in touch on Twitter.