Analyze AB tests in GA4 via Big Query

A question that pops up a lot is: I’ve added experiment events in GA4, now what? How can I tie these events to conversions?

This question is rather fundamental when you rephrase it like this:

I want to know…

when a certain event happened for a user
what happens next, and how often?

In the GA4 interface, you can create an explore report of type Funnel and you will have a convenient user interface to click to an answer.
But.. if you want un-modeled, un-thresholded and exactly counted data, you will need to resort to Big Query for an answer, and you will need SQL.

Here’s a query that will help you get started answering that question:

The basic and effective approach

The simplest approach works like this:

Give me all users that were in the experiment (or: where event X happened), along with the following information:
- the timestamp: when did they become part of the experiment
- the variant: which treatment did they receive
Give me all users that converted (or: where event Y happened), along with the following information:
- the timestamp of the conversion event
- optionally: the conversion value (e.g. revenue)
Compare the users from group 1 and 2
- the user needs to be the same (obviously)
- the conversion timestamp needs to be after the experiment
Aggregate the results into something that can be entered into a calculator
- sample ratio mismatch calculator
- significance calculator
- an R or Python notebook

When you nail this approach into a query, there’s some depth you can add. Like

Breakdown by device
Removing outliers (super large orders, people who convert 100 times, etc)
adding multiple conversions, even multiple steps

But, let’s KISS – keep it simple, stupid and look at how this translates into SQL:

The SQL

In order to make the query flexible, I’m adding 3 flexible variables. Two for dates, and 1 for experiment ID.

The assumption is that the experiment ID is in an event parameter of type STRING, called ‘exp_id’, and the variants are in a STING parameter named ‘exp_variant’.

declare start_date STRING default '20230621';   -- YYYYMMDD: when did the experiment start
declare end_date   STRING default '20230626';   -- YYYYMMDD: and end

declare experiment_id STRING default 'Experiment 001';  -- the exp_id parameter you want to filter

Let’s start the query with (see above – step 1) – get met the users that are in the experiment:

with in_experiment as (
  SELECT
    user_pseudo_id as cookie,

    if(
      count(distinct (select value.string_value from unnest(event_params) where key = 'exp_variant')) > 1,
      '_multiple_variations_seen',
      max((select value.string_value from unnest(event_params) where key = 'exp_variant'))
    ) as variant,
    min(event_timestamp) as first_exposure
  FROM `production-proj.analytics_12345.events_*`
  where _table_suffix between start_date and end_date
  and (select value.string_value from unnest(event_params) where key = 'exp_id') like experiment_id
  group by 1
)

This will output 1 row per user (cookie) that was part of the experiment.

Note the variant line: this checks if the user has been exposed to multiple variations. If that is the case, we assign the “_multiple_variations_seen” value to that user. Don’t count them twice, don’t include those users in the results.

Now, move on to step 2 and 3:

start with the users in the experiment
add conversion columns via a left join, based on
- same user
- conversion happened on a later point in time

, raw_results as (
  SELECT
    exp.cookie,
    exp.variant,
    
    countif(event_name = 'add_to_cart') as add_to_carts,
    countif(event_name = 'purchase') as purchases,
    max(if(event_name = 'add_to_cart', 1, 0)) as add_to_cart_users,
    max(if(event_name = 'purchase', 1, 0)) as purchase_users,

  from in_experiment as exp
  left join `production-proj.analytics_12345.events_*` as events
    on exp.cookie = events.user_pseudo_id
    and exp.first_exposure <= events.event_timestamp
  where events._table_suffix between start_date and end_date
  group by 1,2
)

You see I count 2 conversion events: add_to_cart and purchase.

And I count them twice: a raw counter (the number of times per user) and a user counter (either 1 or 0 per user).

The join condition is restricted to “same user” and “conversion event must be later than the experiment exposure”.

Summarizing it together

The next bit is just presenting everything in a nice table:

The summary CTE aggregates the counters per variant.
The final query adds rates and does some nice formatting

, summary as (
  select
    variant,
    
    count(distinct cookie) as users,
    sum(add_to_carts) as add_to_carts,
    sum(add_to_cart_users) as add_to_cart_users,
    sum(purchases) as purchases,
    sum(purchase_users) as purchase_users,
  from raw_results
  group by 1
)

select
  *,
  round(100 * add_to_cart_users / users, 2) || '%' as add_to_cart_rate,
  round(100 * purchase_users / users, 2) || '%' as purchase_rate,
from summary
order by variant

The Result

A sample result can look like this:

The metrics ending with “users” can be plugged into a statistical calculator (for a simple binomial yes/no) result.

And of course, you can use the “raw_results” CTE output to read into a notebook or raw results calculator.

What’s next?

Using this query, you will have a starting point to summarize you AB test results quickly and easily.
When you know what you’re doing, you can start adding drilldowns, custom conversions, etc.

Can you even be more clever? For sure! Using a WINDOW function in stead of a JOIN will open some more doors to tweak your result set (conversion windows come to mind).

For now: thank you for reading. If you have found this useful: please share, comment, spread the word!
If you want to know more: you know where to find me :)

Analyze AB tests in GA4 via Big Query

The basic and effective approach

The SQL

Summarizing it together

The Result

What’s next?

Recent Posts

Need help?

Search

Find me here

The basic and effective approach

The SQL

Summarizing it together

The Result

What’s next?

Related posts:

Leave a Reply Cancel reply

Recent Posts

Need help?

Search