So, we all love GA4, and we all have the BigQuery export enabled to do limitless analysis. Yay!
But... there are still some nasty limitations to GA4.
Updated 2023/08/23: the page_location truncation limit has been raised to 1000 characters! It was 420. So it's less of a problem now, but I will keep the post alive here.
One of the trickiest is the truncation of the page_location parameter to 1000 characters. This is mentioned in the limit list now (it wasn't before).
page_location, where the full URL of your page views is stored. The dimension that is probably the most used in all of Google Analytics. That one.
(Oh, read all about GA4 pageviews here.)
Here’s what happens:
- the URL of your site is loaded, including all the internal query parameters you use
- utm campaign tagging is sometimes added
- sometimes, a gclid parameter is added, too
- then, ALL tags on the page will use the same page_location parameter and send it correctly
- somewhere between the collection and the GA interface (graphical, or BigQuery interface) the truncation happens
Why is it a problem?
Well... it depends. If you rely on all query parameters and all URL information being present, it can be very annoying to have the URLs truncated. Especially in the gclid case, which is almost always at the end of a URL.
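To make the gclid case concrete, here is a minimal Python sketch of what a hard cut at 1000 characters does to a long URL. The URL, the parameter values and the helper names are made up for illustration; the only thing taken from GA4 is the 1000-character limit, and the real truncation happens somewhere in Google's pipeline, not in code like this.

```python
from urllib.parse import urlsplit, parse_qs

PAGE_LOCATION_LIMIT = 1000  # current GA4 limit for page_location (was 420)


def truncate_page_location(url: str, limit: int = PAGE_LOCATION_LIMIT) -> str:
    """Simulate a hard cut at `limit` characters (illustrative only)."""
    return url[:limit]


def query_params(url: str) -> dict:
    """Return the query parameters of `url` as {name: [values]}."""
    return parse_qs(urlsplit(url).query)


# Hypothetical landing page URL: a long internal parameter first,
# then utm tagging, and the gclid at the very end.
full_url = (
    "https://www.example.com/search?filters=" + "x" * 950
    + "&utm_source=google&utm_medium=cpc&gclid=EXAMPLE_CLICK_ID"
)
stored_url = truncate_page_location(full_url)

print("gclid" in query_params(full_url))    # True
print("gclid" in query_params(stored_url))  # False
```

Everything after the cut is gone, and because the gclid sits at the end it is the first thing to disappear.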
How big is the problem? Query time!
Here’s a query that calculates how many of your pages are “at risk” and how many are truncated already.
with locations as (
  select
    -- full page URL as stored in the export
    (select value.string_value from unnest(event_params) where key = 'page_location') as page,
    -- its length in characters
    length((select value.string_value from unnest(event_params) where key = 'page_location')) as chars
  from `myproject.analytics_nnnnn.events_202109*` -- select your own project and date range
  where event_name = 'page_view'
)
select
  -- bucket labels are zero-padded so that "order by 1 desc" puts the longest URLs first
  case
    when chars > 999 then '999-...: truncated'
    when chars > 900 then '901-999: dangerzone'
    when chars > 800 then '801-900: watch out'
    when chars > 400 then '401-800: quite long'
    when chars > 100 then '101-400: ok'
    when chars > 20 then '021-100: no worries'
    when chars is null then '000-000: null, wut?'
    else '001-020: l33t'
  end as location_length,
  count(*) as page_views,
  round(100 * count(*) / sum(count(*)) over(), 1) || ' %' as pct_of_total,
  count(distinct page) as unique_locations
from locations
group by 1
order by 1 desc
Change myproject.analytics_nnnnn.events_202109* to your own project name and desired date range.
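The output will look something like this: one row per length bucket, with the columns location_length, page_views, pct_of_total and unique_locations.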
There are several workarounds possible. The easiest is to save the parameters you NEED into an event parameter:
- create a GTM variable of Type: URL, and Component: Query
- use that in the Configuration tag
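The idea behind this workaround, keeping only the parameters you need as separate values, can be sketched in Python as follows. This is not GTM code: in GTM the URL variable does the extraction for you. The function name pick_params and the example URL are hypothetical, and the sketch only illustrates the logic.

```python
from urllib.parse import urlsplit, parse_qs


def pick_params(url: str, wanted: list[str]) -> dict[str, str]:
    """Return the first value of each wanted query parameter present in `url`."""
    params = parse_qs(urlsplit(url).query)
    return {name: params[name][0] for name in wanted if name in params}


# Hypothetical URL; only gclid and utm_campaign are needed downstream.
url = "https://www.example.com/?color=red&utm_campaign=summer&gclid=EXAMPLE_CLICK_ID"
print(pick_params(url, ["gclid", "utm_campaign", "utm_term"]))
# {'gclid': 'EXAMPLE_CLICK_ID', 'utm_campaign': 'summer'}
```

Each value then travels in its own event parameter and no longer depends on page_location staying intact.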
But that is tricky, and goes against my own principle: Do not mess with the data collection itself. Use the data processing for modifications.
Also, Google: Give us the full page_location please. We’ll pay for the storage, ok? Thanks.
Have a nice day, and happy data collecting!