Stuifbergen.com

Analytics & Growth Hacking

  • About
  • Contact
  • Professional Services

Find me here

  • Mastodon (social)
  • Mastodon (measure)
  • No more Twitter
You are here: Home / Blog / Visualize Snowplow data with Google Data Studio

Visualize Snowplow data with Google Data Studio

27 March 2019 by jules Leave a Comment

In this post, I will show you how to easily visualize any data measured by Snowplow Analytics. I will use performance timing data as an example.

The data I use is stored Big Query, and I will visualize in Google Data Studio using a view that is small and snappy.

See my post on setting up Snowplow Analytics on the Google Cloud Platform on how to get your own pipeline set up.

The result is going to look like this:

Enable performance timing tracking

To track the data, you will first need to enable performance timing data tracking on your site. This works by setting performanceTiming to true in the contexts block of your tracker:

Make a small BigQuery view

Datastudio can be very slow when connecting a large dataset, so we’re going to create a small view.

Sidenote: To make it work even faster, you can create a separate table (since views are basically a query over a larger dataset), but this works really well, since data caching works fine.

Make a query – save to view

In the Big Query UI, you can easily save the results of a query to a view to use in Data Studio. The Query I use:

#standardSQL
select
  pv.app_id,
  format_timestamp('%Y%m%d',pv.derived_tstamp) as day,
  pv.page_urlpath,
  pt.connect_end - pt.connect_start as connect_end,
  pt.dom_loading - pt.connect_start as dom_loading,
  case when pt.dom_interactive > 0 then pt.dom_interactive - pt.connect_start else null end as dom_interactive,
  case when pt.dom_interactive > 0 then pt.dom_interactive - pt.dom_loading else null end as dom_loadtime,
  case when pt.dom_complete > 0 then pt.dom_complete - pt.connect_start else null end as dom_complete,
  case when pt.dom_complete > 0 then pt.dom_complete - pt.dom_loading else null end as dom_completetime,
  case when pt.dom_content_loaded_event_end > 0 then pt.dom_content_loaded_event_end - pt.connect_start else null end as page_loaded
from `PROJECTNAME.DATASETNAME.TABLENAME` as pv , unnest( contexts_org_w3_performance_timing_1_0_0 ) as pt
where
  pt.connect_start is not NULL
and
  event = 'page_view'

In the results tab, you can check and tune your query until you’re happy with the results.

To save query costs, you might want to consider restricting table length by adding more WHERE clauses (on date, or a subset of pages).

Once you’re happy, click the Save view button, to save this as a table in one of your datasets.

That’s it. You now have a view you can use in Data Studio

Connect Data Studio

In Data Studio, in a report, your data source should be immediately visible when clicking Create New Data Source and then choosing Big Query.

The columns in your view are directly usable in your graphs and widgets.

Create graphs and stare at them in awe

Data Studio works quite intuitively. Insert a graph type you like, select day, and the metrics you want to plot, and you’re done.

Make sure to change the aggregation type of the performance metrics from SUM to Average or Median.

That’s it

Done already? Yes! Congratulations, grab a coffee, you deserved it :)

Got tips, remarks? Leave a comment, or tweet me @zjuul – thanks!

Related posts:

Default ThumbnailDoing Cohort Analysis in R (using ggplot) Analyse Web Site Click Paths as Processes Scripted Install: Snowplow Analytics on the Google Cloud Platform Howto: add custom schemas to Snowplow GCE (and BigQuery)

Filed Under: Blog Tagged With: big query, data visualisation, google cloud, snowplow

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Recent Posts

  • How to use a pre-hashed email address in GA4 tags in GTM
  • GA4 intraday exports and cookieless pings
  • July 2024 change in collected_traffic_source in GA4 Big Query
  • srsltid parameter, GA4 and Big Query
  • Smart incremental GA4 tables in Dataform

Need help?

  • Contact me

Search

© Copyright Jules Stuifbergen · Powered by Genesis ·