Data Journal | All Posts

Advanced Google Analytics 4 Dashboard Development in Looker Studio Using BigQuery

February 19, 2025

## Introduction This tutorial delves into the advanced implementation of a Google Analytics 4 (GA4) dashboard in Looker Studio, leveraging BigQuery as the primary data source. By the end of this guide, you will have constructed a robust acquisition dashboard to facilitate data-driven decision-making for stakeholders. If you are not yet familiar with GA4’s data export schema in BigQuery, it is highly advisable to review the official [Google Analytics 4 export schema documentation](https://support.google.com/analytics/answer/7029846?hl=en). This documentation provides crucial insights into column structures, expected values, and nested fields. For instance, the `event_params` column consists of an array of records. Each record contains a `key` (parameter name) and a `value`, which is a struct that supports multiple data types such as `string_value`, `int_value`, and `double_value`. Here is the [dashboard](https://lookerstudio.google.com/u/0/reporting/fb0c939c-dd51-4ac7-bc1f-f0adcff1edfd/page/rVYDD) that we will be constructing. --- ## Filters Ribbon Filters enhance dashboard interactivity by enabling refined data views based on acquisition attributes. The primary filters included in this dashboard are: - **Acquired Campaign** - **Acquired Source** - **Acquired Medium** - **Date Range Picker** ### Available Dimensions and Metrics Below is a comprehensive list of dimensions and metrics accessible via the BigQuery connector: | Dimension | Type | |-----------|------| | Acquired Campaign | Text | | Acquired Medium | Text | | Acquired Source | Text | | Advertising ID | Text | | App ID | Text | | App Install Source | Text | | App Version | Text | | City | Geo - City | | Continent | Geo - Continent | | Conversion | Boolean | | Country | Geo - Country | | Days Since First Touch | Number | | Device Category | Text | | Device Model (OS) | Text | | Event Date | Date | | Event Name | Text | | Event Param Value | Number | | Event Param Value (String) | Text | | Event Previous Time | Date & Time | | Event Time | Date & Time | | Event Time Offset | Number | | Event Value (USD) | Currency (USD - US Dollar ($)) | | Firebase App ID | Text | | First Touch Time | Date & Time | | Language | Text | | Limit Ad Tracking Enabled | Text | | LTV Currency | Text | | LTV Revenue | Currency (USD - US Dollar ($)) | | Mobile Brand Name | Text | | Mobile Model Name | Text | | Mobile Marketing Name | Text | | OS | Text | | OS Version | Text | | Platform | Text | | Region | Country subdivision (1st level) | | Stream ID | Text | | Time Zone Offset | Number | | User ID | Text | | User Property Name | Text | | User Property Value (Double) | Number | | User Property Value (Integer) | Number | | User Property Value (String) | Text | | User Pseudo ID | Text | | Event Count | Number | | Unique Users | Number | ## User Overview Section This section consists of: - **Three scorecards:** Total Users, Active Users, and New Users - **A time series chart** displaying these metrics over time ### Total Users - **Metric:** Unique Users - **To display percentage change**, enable the **Comparison Date Range** option and set it to **Previous Period**. ### Active Users Since July 2023, Google Analytics 4 introduced the `is_active_user` field in BigQuery exports. If you are using this newer schema, simply apply a distinct count on `user_pseudo_id`. Otherwise, use the following filter combination: **Filter Conditions:** - `Event Param Name = engagement_time_msec` AND `Event Param Value > 0` - OR `Event Param Name = session_engaged` AND `Event Param Value = 1` **Metric:** COUNT DISTINCT on `user_pseudo_id` ### New Users - **Metric:** Unique Users - **Filter:** `Event Name = first_visit` ### Time Series Chart for Users Since this chart contains multiple metrics, a **blended data source** is required, consisting of three tables: - **Table 1:** Total Users (Unique Users) - **Table 2:** Active Users (using the filters mentioned above) - **Table 3:** New Users (using `first_visit` event filter) **Join key:** Event Date **Join type:** Left Outer ## Sessions Overview Section This section contains: - **Three scorecards:** Sessions, Engaged Sessions, and Engagement Rate - **A time series chart** ### Sessions - **Metric:** Event Count - **Filter:** `Event Name = session_start` ### Engaged Sessions - **Metric:** Event Count - **Filter:** `Event Param Name = session_engaged` AND `Event Param Value = 1` ### Engagement Rate - **Formula:** Engaged Sessions / Sessions - Use blended data sources with a **cross join** (no join key required). ## Events Overview This section presents a tabular view of events and their respective event counts. - **Dimension:** Event Name - **Metric:** Event Count ## Pages Overview This section analyzes **page views per page title**. **Filters:** - `Event Name = page_view` - `Event Param Name = page_title` **Metrics & Dimensions:** - **Dimension:** `Event Param Value (String)` → Rename to **Page Title** - **Metric:** `Event Count` → Rename to **Views** ## Performance Overview This section visualizes key performance indicators across three charts: ### Sessions by Country - **Chart Type:** Geo Chart - **Geo Dimension:** Country - **Metric:** Event Count - **Filter:** `Event Name = session_start` - **Zoom Area:** World ### Top 5 Countries (Sessions) - **Chart Type:** Bar Chart - **Data Source:** Duplicate *Sessions by Country* - **Limit Bars to:** 5 (Descending order by Sessions) ### Sessions by Source / Medium - **Metric:** Sessions (Event Count with `session_start` filter) - **Custom Dimension:** `CONCAT(Acquired Source, " / ", Acquired Medium)` ## Conclusion By following this structured approach, you have developed a dynamic acquisition dashboard in Looker Studio powered by BigQuery. Mastering the manipulation of Google Analytics 4 data in SQL enables advanced analysis, custom metrics development, and deeper insights into user behavior. Continue exploring the dataset to refine your dashboard further and extract meaningful business intelligence.

How to Configure Data Redaction in Google Analytics 4

October 3, 2024

Data redaction is a feature in Google Analytics 4 that helps you protect your property ensuring, to an extent, that no Personally Identifiable Information (PII) is collected. Configuring this feature in Google Analytics 4 is essential for adding an extra layer of security to your data collection efforts. ### What is a URL query parameter? A URL query parameter is a key-value pair that follows the question mark (?) or (&) at the URL. For example, the event parameter, **form_destination** might contain a URL with query parameters such as `https://www.example.com/us?language=english&catalog=summer24` ## How does data redaction work? This feature uses text patterns to identify likely to be email addresses in all event parameters and URL query parameters that are included as part of the event parameters `page_location`, `page_referrer`, `page_path`, `link_url`, `video_url`, `form_destination` Data redaction evaluates events before they are collected to find and redact (remove) any text it understands as an email or query parameter key-value pair. After the redaction of the text, data collection proceeds as expected. Data redaction evaluates events before they are collected to find and redact (remove) any text it understands as an email or query parameter key-value pair. After the redaction of the text, data collection proceeds as expected. Again, while this is a powerful tool preventing you from sending or collecting PII inadvertently, the ultimate responsibility falls on you, the entity collecting the data. Please ensure that you are meeting all regulatory requirements for data collection, and do not rely on data redaction as your primary defence against sending or collecting PII. As mentioned before, this feature should be another tool in your arsenal and not your main tool. ### Is data redaction activated by default for Google Analytics 4? Any property created after the release of this feature will have data redaction for email on by default. Data redaction for URL query parameters will still have to be configured separately. If your property was created before the release of this feature, you will need to configure data redaction for both email addresses and URL query parameters. <aside> 💡 **Please note that data redaction is only available for web data streams.** </aside> ### Data Redaction - Good to Know - Data redaction evaluates event data for email addresses on a best-effort basis - Data redaction occurs client side after Analytics modifies or creates events and **before** data is sent to Analytics - Data redaction accepts recent-encoded URL query parameters, including Unicode characters accepted by browsers - Data redaction may incorrectly interpret text as as an email address and redact the text. For instance, if the text contains `@` followed by a top-level domain name it may be incorrectly removed - Data redaction does not evaluate HTTP-header value - Data redaction won’t prevent the collection of PII via Measurement Protocol or Data Import ## How to configure data redaction in Google Analytics 4? To configure data redaction in Google Analytics 4, you can follow these steps: 1. In the Admin section, under *Data collection and modification*, click **Data Streams** 2. Select the relevant web data steam 3. Click **Redact data** 4. Turn on the switch for each option you want to activate data redaction for 1. Email 2. URL query parameters 5. If you chose to active data redaction for URL query parameters, enter a comma delimited list 1. For example: first_name, last_name, email, address, phone_number 6. The last step is to test URL query parameters data redaction using the **Test data redaction** feature to see how Analytics removes the data in question 1. Under the data in the text field. Make sure it contains URL query parameters you entered in the previous step 1. https://www.example.com/?firstname=John&lastname=Doe&email_address=johandoe@example.com 2. Click preview redacted data 1. [https://www.example.com/?firstname=(redacted)&lastname=(redacted)&email_address=(redacted)](https://www.example.com/?firstname=(redacted)&lastname=(redacted)&email_address=(redacted)) If you are satisfied with the result, then you are done. Otherwise, change the URL query parameters until you get the intended result. ## Conclusion Data redaction is a very powerful feature that allows you to protect your property from collecting PII inadvertently. Combining this with Consent Mode can ensure that your business is compliant with data regulations and that no data is being passed while it should not.

Data Visualization with Open Source Web Analytics

February 24, 2024

In our [previous discussion](/swetrix), we explored Swetrix, an innovative, open-source web analytics tool beneficial for businesses seeking to evaluate their online performance. This chapter` will guide you on leveraging Swetrix data to create a personalized dashboard, providing essential insights for website optimization. For those interested, the complete code is accessible on [Github](https://github.com/AzizDhaouadi/swetrix-plotly), allowing you to clone the project and tailor your dashboard. This tutorial employs Express, JavaScript, and Plotly.js to craft a dashboard showcasing data such as pageviews, unique visitors, session durations, bounce rates, average page speeds, and more, offering a comprehensive view of your website's performance. The flexibility of the project means you can adjust the dashboard's design to meet your specific needs. For this tutorial, we will be using Express, JavaScript and Plotly.js to build a dashboard that shows the following data: - Pageviews - Unique visitors - Session duration - Bounce rate - Average page speed - Traffic per country - Unique visitors Tech Breakdown - Unique visitors per browser - Unique visitors per Device Category - Unique Visitors per OS - Visits per page - Visits per referrer - Custom events breakdown As you can see this dashboard contains all the necessary information for us to make decisions about the performance of our website. And the good news is we can make this dashboard look and feel the way we want as we have full control over the code. ## Project Anatomy The first thing we need to get started is a Swetrix account with data collected in order to visualize it. If you need to create a Swetrix account, you can so using their [sign up page](https://swetrix.com/signup). If it's your first time using this tool, you can refer to their [official documentation](https://docs.swetrix.com/). Once you are setup, it's time to get started. ### Express, Plotly.js, and Dependencies As mentioned above, the main components of this project are Express, Plotly.js and JavaScript. If you do note have Node.js installed on your machine, please do so. You will this in the upcoming steps. The following are the dependencies we need to get things going: - Express - Node-fetch - Dotenv You do not have to worry about installing these if you are going to clone the project. The Readme file contains all the details you need to get started. The point of explaining these dependencies is to understand their role in the project, so everything is clear. ## Swetrix's Statistics API A core aspect of this project is the Swetrix Statistics API, which provides access to the data displayed in the Swetrix dashboard. We will use specific API endpoints (/log, /log/birdseye, /log/performance) to fetch necessary data for our dashboard. Familiarity with the Fetch API and asynchronous JavaScript will be beneficial here.. If you are not familiar with the fetch API, you can read more about it in depth in the [MDN Web Docs](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API). Additionally, being familiar with Async JavaScript would be a plus. ## API Endpoints To retrieve the data we need, we will be using 3 endpoints being: - /log - /log/birdseye - /log/performance Each endpoint will return to us a specific set of data that we will use to create our charts. ### Log Endpoint This endpoint will return all the data we need about our traffic. In truth, most of the charts mentioned above will be built using the data from this endpoint. ### Log Birdseye Endpoint This endpoint will return the traffic overview data. You can this of it as an executive summary of the log endpoint data set. ### Log Performance API This endpoint returns data regarding the performance of our website. For instance, the average page load speed time is one of the data points that can be found querying this endpoint. ## Implementation Details ### Index.js File Now that we understand what our data source is and what our dependencies are, it's time to dive into the source code. The first file we are going to take a look at is the index.js file. The latter is the point of entry to our application and contains the code to tell our server what to do. The code in this file is setting essential variables we need for our application, and defines API routes that fetch data from the Swetrix Statistics API when queried. Let's take a look at the fetch variable. ```js const fetch = (...args) => import("node-fetch").then(({ default: fetch }) => fetch(...args)); ``` One of the project's dependencies is Node-fetch. This is a lightweight module that brings the Fetch API to Node.js. You can read more about this package on [NPM's documentation](https://www.npmjs.com/package/node-fetch). #### app.get('fetch/\*') The index.js file contains multiple app.get functions with routes that look like fetch/\*. These functions tell our server what to do when a get request at the given route is called. For instance, when the server gets a get request at the route /fetch/trafficData it will send a get request to Swetrix's log API endpoint to fetch traffic data. Each app.get function is designed to get specific data from Swetrix's servers. The reason we are defining these functions instead of using the fetch API client-side is to avoid exposing the API key. In fact, client-side fetch requests will be sent to our designated endpoints which in turn will fetch the required data. The dotenv package helps use define and use environment variables in our project so that sensitive information like API keys are not exposed. If you are going to use environment variables ensure to add the .env file to the add .gitignore file so that you do not commit your data to a repository. #### app.use() In our Express application, using `app.use(express.static(path.join(__dirname, "/public")));` is a good way to serve static files such as images, CSS files, and JavaScript files. This line tells Express to serve the static files from a directory named public located in the root of your application. The path.join(\_\_dirname, "/public") part constructs an absolute path to the public directory, which helps avoid issues related to the difference in file paths across operating systems. ## Index.html The index.html file has a very straightforward structure. Note that we are using Boostrap for this project to make styling and responsiveness much easier to handle. Most of the HTML elements in this file will have their content added/updated with JavaScript. ## JS folder In our JS folder, you will see two subfolders and 3 files. Each file starts with render and the rest of the name describes what it will be rendering. For instance, the file `renderTrafficData` is going to be used to render traffic data. Let's breakdown the anatomy of the file in order to get an idea of how data is being rendered. ### renderTrafficData.js The file contains 7 import statements. The import statements are getting us the functions necessary to render the charts using Plotly.js. The next thing we see in the file is the`fetchTrafficData` which will be used to fetch the data we are going to pass to our visualization functions. The fetch request is actually sent to our endpoints. The second function uses the data we fetched to visualize the data using Plotly. All the Plotly-related code is in the dashboardVisualizations folder. The code was kept separate to make readability easier. ## Conclusion This tutorial provides a comprehensive guide to building a custom dashboard using Swetrix data. By understanding the project's structure and implementation, you can adapt and extend your dashboard, integrating web analytics with other tools for richer insights.

Set up Enhanced Conversions for Google Ads with Google Tag Manager

September 15, 2024

As tracking restrictions become tighter and tighter, enabling ad platforms with First-Party data has never been more important. In this tutorial, we will discuss how you can implement Google Ads' Enhanced Conversions with Google Tag Manager to provide Google Ads with the right data for your campaign optimizations. ## What is Enhanced Conversions? Enhanced conversions is a feature that aims at improving the accuracy of conversion measurement. The feature supplements any existing conversion tags by sending hashed first-party data collected from the website while being privacy compliant. Enhanced Conversions uses a secure one-way hashing algorithm known as `SHA256` on your data before sending to Google. This ensures that customer data such as email addresses are protected. This hashed data is later matched with signed-in Google accounts to attribute your conversion to ad events, such as clicks or views. <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 💡 Conversions measured by importing <a href="https://support.google.com/google-ads/answer/2401634" target="_blank">Google Analytics</a> goals aren't supported for enhanced conversions. If you'd like to use enhanced conversions, consider setting up a new Google Ads conversion action with the Google tag or Google Tag Manager. </aside> ## Implementation Pre-requisites Before we dive into the implementation, it is recommended to review this list to ensure that the implementation goes as smoothly as possible: - Know where the conversion tag fires. What is the conversion page URL? This can be a confirmation page URL. - What is the conversion event trigger? How does the conversion tag fire? Is it a button click? Or is it a form submission? Or is it on a page view? - Make sure that there is first-party customer data available on the conversion page. The available data should be email, full name and home address and/or phone number - How is the conversion tracking set up? This setup requires an understanding of the conversion tracking setup and possibly code changes. So, make sure you have access to the code base or you have access to your developers to communicate requirements - If your conversion tags are set using URLs, enhanced conversions can only be set using Javascript or CSS selectors options or automatic enhanced conversions - The impact results in the conversion action table willƒ be seen about 30 days after implementing enhanced conversions successfully. ## Turning on Enhanced Conversions for Web <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 🚨 The following instructions are based on the new design of the Google Ads user experience. </aside> 1. In your Google Ads account, click the **Goals** icon . 2. Click the **Conversions** drop down in the section menu. 3. Click **Settings**. 4. Expand the “Enhanced conversions” section. 5. Check “Turn on enhanced conversions for web”. 6. Review compliance statement. To enable enhanced conversions you must confirm that you'll comply with our [policies](https://support.google.com/adspolicy/answer/7475709) and that the [Google Ads Data Processing Terms](https://privacy.google.com/businesses/processorterms/) apply to your use of enhanced conversions. Click **Agree** **and continue** to acknowledge your acceptance of these terms. 7. Click the dropdown to choose a method for setting up and managing user-provided data. Select “Google Tag Manager”. <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 🚨 Note: If Google enhanced conversions data is collected in a way that’s different from Google Tag Manager, that data may not be processed. For instance, since we selected that we are going to send enhanced conversion data through Google Tag Manager and we try sending data through the Google Ads API, the data sent through API won’t be processed. </aside> 8. Click Go to **Google Tag Manager** and follow the instructions to complete the process in Google Tag Manager ## Setting up Enhanced Conversions in Google Tag Manager There are 3 ways we can set up enhanced conversion in Google Tag Manager, and we will review them separately so you can choose the method that suits you best. - **Automatic collection**: This method allows the tag to automatically detect and collect user-provided data on the page. This method is the easiest and quickest out of the three, but it is not as reliable as adding a code snippet or specifying CSS selectors or Javascript variables. - **Code**: Add a code snippet on the website that sends hashed user-provided data for matching. This method is the most reliable as it ensures that you are consistently sending properly formatted data whenever the conversion tag fires. This method is the most suitable if you wish to enhance the accuracy of enhanced conversions. - **Manual configuration**: Specify which CSS selectors or Javascript variable on the conversion page contain the relevant user-provided data. This method is more precise than the automatic collection, but it is not as reliable or consistent as adding code to the website. This method is not recommended is you frequently change the website’s code. The latter can create inconsistencies that may break enhanced conversion tracking. If you do consistently update your code base, it is recommended that you use dataLayer variable or add code to your website to capture user-provided data. To reiterate, if you frequently change the website formatting, this method is not recommended it could disrupt the CSS selector method event the Javascript method if you are creating your variables based on `querySelector` or `querySelectorAll` <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 💡 It is best practice to use the ID or data attributes to retrieve the value from a DOM element. IDs are unique and less likely to change over time. Data attributes are also less prone to change over time. Both of these attributes change less over time unlike properties such classes. </aside> ### Set up enhanced conversions using automatic collection in Google Tag Manager Automatic collection can be set up using 2 different methods: - **Standard automatic enhanced conversions**: this is the recommended method when user-provided data is available on the conversion event page. For instance, if your conversion event page is a purchase confirmation page and data present on the page includes the user details such as email, phone number or address, use this method. - **Automatic enhanced conversions with the user-provided data event tag**: this is the recommended method when the customer data is not available on the conversion event page, but it is instead available on a previous page. For instance, if the conversion event page is the purchase confirmation page and user’s data is entered on a previous page prior to the confirmation page, use this method. <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> Note: If you use the user-provided data event tag method, first-party customer data will be automatically detected on pages that users visit before reaching the conversion page. With this setup, you authorize Google to use an ads cookie to collect hashed, first-party customer data on your behalf and to connect that data with subsequent conversion events occurring within the same user session on your behalf. All data not connected to a conversion is deleted. If you implement Consent Mode, the ads cookie will be subjected to the ad_storage consent status of the Consent Mode feature, where implemented. </aside> ### Set up standard automatic enhanced conversions 1. Click **Workspace**, then click **Tags** from the navigation menu. 2. Select the Google Ads conversion tracking tag that you’d like to implement enhanced conversions with and edit that tag. - Make sure that this conversion action has the same conversion tracking ID and label as the conversion action that you enabled enhanced conversions for in your Google Ads account. 3. Click **Include user-provided data from your website**. 4. In the dropdown, select **New Variable** or use an existing variable if you’ve already set one up. 5. Select **Automatic collection**. 6. Click **Save** for the variable and then save the conversion tracking tag. ### Set up automatic enhanced conversions with user-provided data event tag 1. Click **Workspace**, then click **Tags** from the navigation menu. 2. Click **New** to create a new tag. 3. Click **Tag Configuration** and select **Google Ads User-Provided Data Event**. 4. Fill in your Google Ads Conversion Tracking ID. - Make sure that this conversion action has the same conversion tracking ID and label as the conversion action that you enabled enhanced conversions for in your Google Ads account. 5. In the dropdown, select **New Variable**. 6. Select **Automatic**. 7. Name the variable. 8. Click **Save**. 9. Click **Triggering** in the Google Ads User-Provided Data Event Tag. This is where you anticipate that the user data will be available. 10. Click the plus icon. 11. Click **Trigger Configuration**. 12. Click **Form Submission**. 13. Select **All Forms**. 14. Select **Save** and then save your new Google Ads User-Provided Data Event tag. <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> Note: You must select form submission for enhanced conversions to work properly </aside> ### Set up conversions using manual configuration in Google Tag Manager The first step in this method is to find which variables we are going to use to set up enhanced conversions. 1. Navigate to the page where user-provided data appears. This data may appear on the conversion page or on a previous page. Identify any customer data that is displayed on the page you want to send to Google. At least one of the following fields must be provided: 1. Email (preferred) 2. Address - First name, last name, postal code, and country are required 1. Optional: Street address, city, and region as additional match keys 3. A phone number can also be provided as a standalone match key but is recommended to be sent along with an email 2. Once the customer data on the page is identified, you will need to copy the CSS selectors and enter those in Google Tag Manager. You can also data layer variables if they already exist. This allows enhanced conversion tags to know which pieces of data to hash and send to Google. It’s important to keep the conversion page open. 3. In a separate tab, open Google Tag Manager to set up manual enhanced conversions. This can be done in 2 different ways. 1. **Standard manual enhanced conversions**: Use this method when user-provided data is available on the conversion event page. 2. **Manual enhanced conversions with user-provided data event tag**: Use this method when user-provided data is not available on the conversion event page but is instead available on a previous page <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> Note: If you use the user-provided data event tag method, first-party customer data will be automatically detected on pages that users visit before reaching the conversion page. With this setup, you authorize Google to use an ads cookie to collect hashed, first-party customer data on your behalf and to connect that data with subsequent conversion events occurring within the same user session on your behalf. All data that is not connected to a conversion is deleted. If you implement Consent Mode, the ads cookie will be subjected to the ad_storage consent status of the Consent Mode feature, where implemented. </aside> ### Set up standard manual enhanced conversions 1. Click **Workspace**, then click **Tags** from the navigation menu. 2. Select the Google Ads conversion tracking tag that you’d like to implement enhanced conversions with and edit that tag. - Make sure that this conversion action has the same conversion tracking ID and label as the conversion action that you enabled enhanced conversions for in your Google Ads account. 3. Click **Include user-provided data from your website**. 4. In the dropdown, select **New Variable** or use an existing variable if you’ve already set one up. 5. Select **Manual configuration**. - You may also select “Code” if you'd like to use Custom Javascript or other data objects to send your data through Google Tag Manager. This method requires data to be formatted in a particular way, so if you’d like to do this, you can read the “Code” instructions below. Here you'll see “User provided data” at the top of the page, followed by all the pieces of customer data which you can include as part of your enhanced conversion tag. 6. For the relevant user data field that you'd like to provide via enhanced conversions, click on the dropdown menu and select **New Variable**. <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> Note: if you already have unhashed variables in the data layer, you can select those instead of creating new variable. </aside> 1. In the “Variable Configuration” screen, select **Choose a variable type to begin setup**. In the “Choose Variable Type” screen, select **DOM Element**. 2. Back on the “Variable Configuration” screen, change “Selection Method” in the dropdown to “CSS Selector.” 3. Give your variable a title. 4. Enter the CSS selector that references your users’ data into the “Element selector” input field (see section below on how to find the CSS Selector). You can leave the “Attribute name” field blank (more on this below). 5. Click **Save** for the variable and then save the conversion tracking tag. ### Set up manual enhanced conversions with the user-provided data event tag 1. Click **New** to create a new tag. 2. Click **Tag Configuration** and select **Google Ads User-Provided Data Event**. 3. Fill in your Google Ads Conversion Tracking ID. - Make sure that this conversion action has the same conversion tracking ID and label as the conversion action that you enabled enhanced conversions for in your Google Ads account. 4. In the dropdown, select **New Variable**. 5. Select **Manual configuration**. 6. For the relevant user data field that you'd like to provide via enhanced conversions, click on the dropdown menu and select **New Variable**. 7. In the “Variable Configuration” screen, select **Choose a variable type to begin setup**. In the “Choose Variable Type” screen, select **DOM Element**. 8. Back on the “Variable Configuration” screen, change “Selection Method” in the dropdown to “CSS Selector.” 9. Give your variable a title. 10. Enter the CSS selector that references your users’ data into the “Element selector” input field (see section below on how to find the CSS Selector). You can leave the “Attribute name” field blank. (more on this below). 11. Click **Save**. 12. Click **Triggering** in the Google Ads User-Provided Data Event Tag. This is where you anticipate that the user data will be available. 13. Click the plus icon. 14. Click **Trigger Configuration**. 15. Click **Form Submission**. 16. Select **All Forms** 17. Select **Save** and then save your new Google Ads User-Provided Data Event tag <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> Note: You must select form submission for enhanced conversions to work properly </aside> ### Identifying CSS Selectors and inputing them in Google Tag Manager This section is for readers that are not familiar with CSS Selectors. If you are, you can skip ahead. This section will show you how to copy CSS Selectors from you conversion event page and paste them into enhanced conversions variables. If you already have data layer variables that capture the relevant data, you can use those instead of creating new variables. 1. Navigate to your website in a separate and keep (or open) a Google Tag Manager page 2. Identify the user-provided data you want to send with enhanced conversions. This data may be on the conversion page or on a previous page 3. Use your mouse to right-click on top of it and select **Inspect** 4. You will see the **Developer** **Tools** launch within your browser 5. Within the source code presented in the Developer Tools page, you will see highlighted code. This code is the page element where you need to extract CSS Selectors for the customer data your right-clicked 6. Depending on your browser, select **Copy Selector** 7. In the other tab, with Google Tag Manager open, paste that text in the **Element selector** field 1. For reference, it should look something similar to but not exactly like this `tsf > div:nth-child(2) > div.A8SBwf > div.RNNXgb > div > div.a4bIc > custEmail` 8. Click **Save** 9. Repeat the steps above for all the user-provided data you want to send with enhanced conversions 10. In Google Tag Manager, click **Save** ## Set up enhanced conversions using Code in Google Tag Manager ### Identify and define enhanced conversions variables In this step, the goal is to make sure the variables needed like email, address are available on the conversion page where the Google Ads conversion tag fires. If you are unsure which page this is, contact your developer. Once you have confirmed which information is available on the conversion event page, you will need to store this data inside a global Javascript variables. You can either send unhashed data, which Google will normalize and hash before the data reaches the servers. Or, you can normalize and hash data yourself. ### Hashing and normalizing data If you decide to normalize and hash the data yourself, follow these instructions: ### Normalization: - Remove leading or trailing whitespaces - Convert the text to lowercase - Format phone numbers according to the [E.164 strandard](https://en.wikipedia.org/wiki/E.164) ### Hashing - Use hex [SHA256](https://support.google.com/google-ads/answer/9004655) The following table lists more information about customer data variables. You can name the variables as you prefer and use the `Key Name` column to know how said variable should be referenced in the enhanced conversions tag. Please note, all data shoud be passed as `String` type variable. | **Data Field** | **Key Name** | **Description** | | --- | --- | --- | | Email address | `email` | User email. | Email address | `sha256_email_address` | Hashed user email. | Phone number | `phone_number` | User phone number. Must be in [E.164 format](https://en.wikipedia.org/wiki/E.164), which means it must be 11 to 15 digits including a plus sign (+) prefix and country code with no dashes, parentheses, or spaces. | Phone number | `sha256_phone_number` | Hashed user phone number. | First name | `address.first_name` | User first name. | First name | `address.sha256_first_name` | Hashed user first name. | Surname | `address.last_name` | User last name. | Surname | `address.sha256_last_name` | Hashed user last name. | Street address | `address.street` | User street address. Example: '123 New Rd' | | City | `address.city` | User city name. Example: `Southampton’ | | Region | `address.region` | User province, state, or region. Example: `Hampshire’ | | Postal code | `address.postal_code` | User post code. Example: 'SO99 9XX' | | Country | `address.country` | User country code. Example: 'UK'. Use 2-letter country codes, per the ISO 3166-1 alpha-2 standard. | The next steps will be to enable enhanced conversions in Google Tag Manager and reference the customer data variables. ### Enable enhanced conversions in Google Tag Manager 1. Sign in to Google Tag Manager, click **Workspace** then click **Tags** from the navigation menu 2. Select the Google Ads conversion tracking tag and edit the tag 1. If you have not set up your Google Ads conversion tracking tag, you can read more about that in [Google Ads conversions](https://support.google.com/tagmanager/answer/6105160) 2. Make sure this conversion action has the same conversion tracking ID and label as the conversion as the conversion action you enabled enhanced conversions for in Google Ads 3. Click **Include user-provided data from your website** 4. Click **Select user-provided data variable**, the select **New Variable** 5. In the new `User Provided data variable` , select **Code** at the bottom 6. Under `Choose Variable Type` , select **Custom Javascript** 7. Copy the following code in the custom Javascript variable ```javascript function () { return { "email": yourEmailVariable , // replace yourEmailVariable with variable name that captures your user’s email "phone_number": yourPhoneVariable , // repeat for yourPhoneVariable and following variable names below "address": { "first_name": yourFirstNameVariable , "last_name": yourLastNameVariable , "street": yourStreetAddressVariable , "city": yourCityVariable , "region": yourRegionVariable , "postal_code": yourPostalCodeVariable , "country": yourCountryVariable } } } ``` <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 💡 Note: you can also hardcode the field with a string or use a function instead of using variables </aside> ### Code sample for normalized and hashed variables ```javascript gtag("set", "user_data", { sha256_email_address: yourNormalizedandHashedEmailVariable, sha256_phone_number: yourNormalizedandHashedPhoneVariable, address: { "address.sha256_first_name": yourNormalizedandHashedFirstNameVariable, "address.sha256_last_name": yourNormalizedandHashedLastNameVariable, city: yourCityVariable, region: yourRegionVariable, postal_code: yourPostalCodeVariable, country: yourCountryVariable, }, }); ``` As mentioned above, the phone number must be in E.164 format, which means it must be 11 to 15 digits including a plus sign (+) prefix and country code with no dashes, brackets or spaces. 1. For each type of customer information in the code above, replace the placeholder variables with the name of the global Javascript variable containing the piece of customer data on the conversion page 1. If your site does not collect a field, remove the field entirely from the code rather than leaving it blank. For instance, if your website only collects emails, your Javascript variable would look like this: ```javascript function(){ return { "email": yourEmailVariable } } ``` 2. Click **Save** The next step is validate that enhanced conversions is working properly. ## Validate Enhanced Conversions Implementation To verify your implementation, navigate the conversion page and follow the steps below. Please test your implementation as soon as you are done with your set up. ### Validate the implementation using Developer Tool 1. Rick click on your web page 2. Click **Inspect** and select the **Network** tab 3. Make sure Network activity is being recorded 4. Enter `google` in the search bar 5. Find the network request that is going to `googleadservices.com/pagead/conversion` or `google.com/pagead/1p-conversion` 6. Click **Payload** to view the list of query string parameters 7. Look for a parameter `em` with a hashed string as the value. The value should start with `tv.1~em` followed by a long string of characters. If you see the `em` parameter, this means that the enhanced conversion tag is picking up and hashing the enhanced_conversion_data object <aside style="background: #f1f1ef; width: 100%; border-radius: 4px; padding: 12px 16px 12px 12px"> 💡 If you see the `em` parameter but you only see `tv.1~em` without a long hashed sting following it then you are sending the enhanced conversions parameter but it is empty. This may happen if user-provided data is not available at the time of conversion. </aside> ### Review the Diagnostics report to confirm your implementation (after 48 hours) After 48 hours of implementing enhanced conversions, you’ll be able to view the tag diagnostics report in Google Ads, which you can use to validate that the implementation is working properly. Follow these instructions to get there: 1. In the Google Ads account, click the Goals icon 2. Click the Conversions drop down in the section menu, then click Summary 3. Click the conversion action that has enhanced conversions enabled 4. Select Diagnostics from the page menu at the top. You’ll be able to see your enhanced conversion tag diagnostics report and the metrics for your enhanced conversion in each section 5. Review the various health checks to make sure everything is working as expected 6. If the tag diagnostics report notified you that something may be wrong, follow the instructions in the notification and the Help Center to troubleshoot

How to flatten Google Analytics 4 Data Schema in BigQuery

July 26, 2025

If you have your Google Analytics 4 data exported to BigQuery, you would have noticed that the schema is quite complicated to work with. Many of the available dimensions are nested in structure and building reports based on said composition can lead to multiple subqueries and ctes being used which increases the difficulty quite a bit. In order to avoid said complexity, we can modify the original exported table in order to make data extraction and manipulation much easier. This will be done though a process known as "**flattening**". In this exercise, we will be taking the original table and modifying it into a structure that is flat where aggregation can be easier to handle. ## The Original Export Schema The GA4 export schema primarily refers to the structure of data when exported to Google BigQuery. This export allows for detailed analysis and custom reporting beyond the standard GA4 interface. ### Key aspects of the GA4 BigQuery export schema #### Events Table: The core of the GA4 export is the *events_YYYYMMDD* table, which contains daily event data. If streaming export is enabled, an events_intraday_YYYYMMDD table is also available for near real-time data. #### Schema Structure: The schema within these tables defines the fields (columns) and their data types. It includes fields for: * event_name: The name of the event (e.g., page_view, purchase). * event_params: A nested record containing key-value pairs for event parameters. * user_pseudo_id: A pseudonymous ID for tracking users. * event_timestamp: The timestamp of the event. * geo: A nested record with geographical information (e.g., country, region). #### Nested and Repeated Fields: The GA4 BigQuery schema utilizes nested and repeated fields (records). This structure allows for storing complex data within a single row, such as multiple event parameters or multiple items in a purchase. #### Data Types: Fields have specific data types, including STRING, INTEGER, RECORD, and REPEATED RECORD. #### User Data (Optional): If user export is enabled and user IDs are collected, additional tables like synonymous_users or users might be included in the export. Understanding this schema is crucial for effectively querying and analyzing GA4 data within BigQuery, especially when dealing with nested and repeated fields, which require specific querying techniques. ## Flattening the table ```sql WITH events_flattened AS ( SELECT event_date, event_name, event_timestamp, event_previous_timestamp, event_value_in_usd, event_bundle_sequence_id, event_server_timestamp_offset, privacy_info.analytics_storage, privacy_info.ads_storage, privacy_info.uses_transient_token, geo.city AS city, geo.country AS country, device.category AS device_category, user_id, user_pseudo_id, user_first_touch_timestamp, traffic_source.name AS traffic_name, traffic_source.medium AS traffic_medium, traffic_source.source AS traffic_source, stream_id, platform, event_dimensions.hostname AS hostname, collected_traffic_source.manual_campaign_id, collected_traffic_source.manual_campaign_name, collected_traffic_source.manual_source, collected_traffic_source.manual_medium, collected_traffic_source.manual_term, collected_traffic_source.manual_content, collected_traffic_source.manual_source_platform, collected_traffic_source.manual_creative_format, collected_traffic_source.manual_marketing_tactic, collected_traffic_source.gclid, collected_traffic_source.dclid, collected_traffic_source.srsltid, is_active_user, batch_event_index, batch_page_id, batch_ordering_id, -- Session Attribution session_traffic_source_last_click.manual_campaign.campaign_id AS session_manual_campaign_id, session_traffic_source_last_click.manual_campaign.campaign_name AS session_manual_campaign_name, session_traffic_source_last_click.manual_campaign.source AS session_manual_source, session_traffic_source_last_click.manual_campaign.medium AS session_manual_medium, session_traffic_source_last_click.manual_campaign.content AS session_manual_content, session_traffic_source_last_click.manual_campaign.source_platform AS session_manual_source_platform, session_traffic_source_last_click.manual_campaign.creative_format AS session_manual_creative_format, session_traffic_source_last_click.manual_campaign.marketing_tactic AS session_manual_marketing_tactic, -- Cross Channel Attribution session_traffic_source_last_click.cross_channel_campaign.campaign_id AS cross_channel_campaign_id, session_traffic_source_last_click.cross_channel_campaign.campaign_name AS cross_channel_campaign_name, session_traffic_source_last_click.cross_channel_campaign.source AS cross_channel_source, session_traffic_source_last_click.cross_channel_campaign.medium AS cross_channel_medium, session_traffic_source_last_click.cross_channel_campaign.source_platform AS cross_channel_source_platform, session_traffic_source_last_click.cross_channel_campaign.default_channel_group AS cross_channel_default_channel_group, session_traffic_source_last_click.cross_channel_campaign.primary_channel_group AS cross_channel_default_primary_channel_group, -- Flattening event parameters COALESCE(event_params.key, '') AS param_key, COALESCE(event_params.value.string_value, CAST(event_params.value.int_value AS STRING), CAST(event_params.value.double_value AS STRING), '') AS param_value FROM `<your- -query-project>.events_2024*`, UNNEST(event_params) AS event_params ), final_flat AS ( SELECT * FROM events_flattened PIVOT( MAX(param_value) FOR param_key IN ( 'term' AS term, 'ga_session_id' AS ga_session_id, 'source' AS source, 'form_destination' AS form_destination, 'form_length' AS form_length, 'engagement_time_msec' AS engagement_time_msec, 'batch_ordering_id' AS batch_ordering_id, 'entrances' AS entrances, 'error_url' AS error_url, 'error_message' AS error_message, 'page_title' AS page_title, 'campaign' AS campaign, 'medium' AS medium, 'link_domain' AS link_domain, 'ga_session_number' AS ga_session_number, 'batch_page_id' AS batch_page_id, 'synthetic_bundle' AS synthetic_bundle, 'session_engaged' AS session_engaged, 'ignore_referrer' AS ignore_referrer, 'outbound' AS outbound, 'error_line' AS error_line, 'form_id' AS form_id, 'page_location' AS page_location, 'page_referrer' AS page_referrer, 'engaged_session_event' AS engaged_session_event, 'percent_scrolled' AS percent_scrolled, 'debug_mode' AS debug_mode, )) ORDER BY event_date, event_name DESC ) SELECT * FROM final_flat ORDER BY 1 DESC, 2 DESC; ``` The following query flattens the schema and allows you to interact with the table as a regular (traditional) table. Let's explore it. ### Read & Unnest (events_flattened CTE) ```sql WITH events_flattened AS ( SELECT -- Top‑level event fields event_date, event_name, event_timestamp, event_previous_timestamp, event_value_in_usd, event_bundle_sequence_id, event_server_timestamp_offset, -- Privacy flags privacy_info.analytics_storage, privacy_info.ads_storage, privacy_info.uses_transient_token, -- Geo & device geo.city AS city, geo.country AS country, device.category AS device_category, -- User identifiers user_id, user_pseudo_id, user_first_touch_timestamp, -- Hit‑level traffic source (last non‑direct click) traffic_source.name AS traffic_name, traffic_source.medium AS traffic_medium, traffic_source.source AS traffic_source, -- Stream & platform stream_id, platform, -- Hostname & manual campaign fields event_dimensions.hostname, collected_traffic_source.*, -- “Active user” flag and batch indices is_active_user, batch_event_index, batch_page_id, batch_ordering_id, -- Session Attribution (last-click UTM tags) session_traffic_source_last_click.manual_campaign.*, -- Cross‑Channel Attribution (GA4’s channel buckets) session_traffic_source_last_click.cross_channel_campaign.*, -- Turn each event_param into its own row COALESCE(event_params.key, '') AS param_key, COALESCE( event_params.value.string_value, CAST(event_params.value.int_value AS STRING), CAST(event_params.value.double_value AS STRING), '' ) AS param_value FROM `<your-project>.<your-dataset>.events_2024*`, UNNEST(event_params) AS event_params ) ``` 1. Top‑level fields: pulls out every piece of context you need (timestamps, user IDs, geo, device, privacy). 2. collected_traffic_source.*: any manual UTM‑style fields you passed in with your tags. 3. Session attribution: under session_traffic_source_last_click.manual_campaign.*, you get the UTM tags that actually started that session. 4. Cross‑channel attribution: under session_traffic_source_last_click.cross_channel_campaign.*, you get GA4’s bucketed source/medium/channel group. 5. UNNEST(event_params): explodes each event’s parameter array into multiple rows, each with param_key and its stringified param_value. ### Pivot Back Into Columns (final_flat CTE) ```sql final_flat AS ( SELECT * FROM events_flattened PIVOT ( MAX(param_value) FOR param_key IN ( 'term' AS term, 'ga_session_id' AS ga_session_id, 'source' AS source, /* …list every key you care about… */ 'debug_mode' AS debug_mode ) ) ORDER BY event_date, event_name DESC ) ``` * Why MAX()? After unnesting, an event can have many rows (one per param). Pivot requires an aggregation, so we pick MAX, which on strings just picks the non‑NULL value. * Key list: explicitly enumerate the event parameters you want as columns. Anything not listed will be dropped. ### Final Output ```sql SELECT * FROM final_flat ORDER BY event_date DESC, event_name DESC; ``` One final sort and you have a single table where each row is one GA4 event, with: * All core event and user context as columns * Exact UTM/manual campaign tags from when the session began * GA4’s channel grouping for easy channel‑level reporting * Every event parameter you care about as its own column* ### Understanding Attribution | Attribution Type | Source of Truth | Key Fields | | -------------------- | -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- | | **Session (Manual)** | Your UTM or manual campaign parameters that actually launched the session (last non‑direct click). | `session_manual_campaign_*` | | **Cross‑Channel** | GA4’s internal source/medium and channel buckets (built from UTMs + defaults + your custom rules). | `cross_channel_*` + `cross_channel_default_channel_group` + `cross_channel_default_primary_channel_group` | * Manual campaign fields give you raw campaign/UTM data exactly as you tagged it. * Cross‑channel fields let you slice by GA4’s standard channel definitions (Organic Search, Paid Social, Email, etc.) without extra work. ### What about E-commerce data? While this query does not cover the flattening of E-commerce related columns, the exact same approach can be utilized to achieve the same result with the said columns. ### What's next? Now that you have a truly flat, analysis‑ready table, one row per event, complete with all attribution and parameters, you can ditch those repeated UNNESTs in every query. Use this table for advanced segment and cohort analyses or to build custom attribution models beyond GA4’s defaults. And when you connect it to Looker Studio, dashboard creation becomes far more straightforward than wrestling with the raw export schema.