Excluding internal traffic while anonymizing IP addresses in Google Analytics

Introduction

Since the recent rollout of the GDPR, many people have realized that IP addresses can in some cases be considered personally identifiable information (PII), and therefore wish to anonymize them in Google Analytics.

Two weeks ago I was at Measure Camp in Copenhagen, giving a talk about IP anonymization and the complications it creates for excluding internal traffic. This blog post sums up the points presented and discussed at the event.

I am not a lawyer, nor an expert on how IP addresses work, so this blog post will revolve more around how to anonymize IP addresses and exclude internal traffic than around GDPR or technical walkthroughs of IP addresses.

The issue

Imagine a small website and a visitor whose ISP is identifiable. If you arrive at the website from a specific newsletter campaign, it will be quite easy to find you specifically in Google Analytics. All it would take is to:

  • Open the Network report
  • Find your ISP
  • Create a segment saying: Only show users who came from a specific ISP and have visited my website through a specific campaign

In other cases, your IP address might be linked to your home address through your telecom provider, and in some cases it will be possible to identify you as a person.

If you know that this might be an issue for some of your users, you can easily anonymize IP addresses through Google Tag Manager by adding a field named “anonymizeIp” with the value “true” under Fields to Set.

What happens is that Google removes the last part of the IP address before storing it (for an IPv4 address, the last octet is set to zero).
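As a rough illustration of the effect on an IPv4 address, here is a minimal sketch. The function name anonymizeIpv4 is made up for this example, and the real masking happens on Google's servers, not in your own code.

```javascript
// Illustrative only: mimics the effect of IP anonymization on an IPv4 address.
function anonymizeIpv4(ip) {
  var parts = ip.split('.');
  if (parts.length !== 4) {
    return ip; // not a dotted IPv4 address; left untouched in this sketch
  }
  parts[3] = '0'; // the last octet is replaced by zero
  return parts.join('.');
}
```

This is also why an exclusion filter written for a full address no longer matches: every address in the same /24 block ends up looking identical.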

While this is all great (besides reducing the accuracy of your geography data on a city level), it does create a major issue, as all your IP exclusion filters will stop working.

To make sure that we can still exclude our internal traffic, we need to figure out whether the user counts as internal or external traffic before sending data to Google Analytics.

The main inspiration for this is a real client case, where I was asked to develop a way to exclude internal traffic while anonymizing IP addresses, without having any resources besides Tag Manager to help me complete the task. A huge thanks goes to Simo Ahava for writing a similar post, which was used as an inspiration for this setup.

The solution(s)

In this section, multiple solutions are provided. Use the one that fits your organization / client the best.

Solution A: Only anonymize external traffic

Step 1: Create a regular expression with your IP addresses

First of all, you need to find all your internal IP addresses, and add them into a regular expression in this format:
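A minimal sketch of what such an expression could look like. The addresses below come from the reserved documentation ranges and are placeholders, not real internal IPs; the names internalIpPattern and isInternalIp are also just for illustration.

```javascript
// Hypothetical internal addresses: one single IP plus the range 198.51.100.10-20.
// Dots are escaped and the pattern is anchored so that 203.0.113.70 does not match.
var internalIpPattern = /^(203\.0\.113\.7|198\.51\.100\.(1[0-9]|20))$/;

function isInternalIp(ip) {
  return internalIpPattern.test(ip);
}
```

Anchoring with ^ and $ is a deliberate choice here: without it, a longer address that merely starts with an internal one would be counted as internal too.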

Step 2: Add it to a script that sets a cookie marking the user as internal or external

After that, you need to include it in this script:

What it does is call the ipify service to check the IP address of the user visiting the site, and then match it against the regular expression you set up. If it matches, the script sets a cookie in the user’s browser saying it is internal traffic. If not, it sets a cookie saying that it is external traffic.
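The flow described above could be sketched roughly like this. It is not the original script: the pattern, the cookie names (internal_traffic / external_traffic) and the function names are assumptions for illustration, while the 30 and 7 day lifetimes follow the durations mentioned in this post. In Google Tag Manager it would go inside script tags in a Custom HTML tag.

```javascript
// Hypothetical pattern of internal addresses (documentation ranges, not real IPs).
var INTERNAL_IP_PATTERN = /^(203\.0\.113\.7|198\.51\.100\.(1[0-9]|20))$/;

// Decide which label an IP address gets.
function classifyIp(ip) {
  return INTERNAL_IP_PATTERN.test(ip) ? 'internal' : 'external';
}

// Build the cookie string: 30 days for internal traffic, 7 days for external.
// Cookie names are assumed: "internal_traffic" or "external_traffic".
function buildTrafficCookie(trafficType, now) {
  var days = trafficType === 'internal' ? 30 : 7;
  var expires = new Date(now.getTime() + days * 24 * 60 * 60 * 1000);
  return trafficType + '_traffic=true; expires=' +
    expires.toUTCString() + '; path=/';
}

// Browser-only part: ask ipify for the visitor's public IP, then set the cookie.
if (typeof document !== 'undefined' && typeof XMLHttpRequest !== 'undefined') {
  var request = new XMLHttpRequest();
  request.open('GET', 'https://api.ipify.org?format=json');
  request.onload = function () {
    if (request.status === 200) {
      var ip = JSON.parse(request.responseText).ip;
      document.cookie = buildTrafficCookie(classifyIp(ip), new Date());
    }
  };
  request.send();
}
```

Note that the regular expression ships to every visitor's browser with this approach, which is the second point listed under Cons further down.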

The duration of the cookie is up to you to set. We have decided to let the cookie for external traffic last 7 days, so as not to store that information for longer than needed, and the cookie for internal traffic last 30 days.

Step 3: Add the cookies to Google Tag Manager and use them as variables

If you create a new variable in Google Tag Manager and select the “1st party cookie” variable type, it is possible to read cookies from a user’s browser and use that information to control which tags you fire. This is especially handy when tags should only fire after the user has given consent.
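To make it a bit more concrete what such a variable hands you, here is a small sketch that looks up a cookie value by name from a cookie string. Tag Manager does this for you, so readCookie is purely a hypothetical stand-in and not Tag Manager's own code.

```javascript
// Hypothetical stand-in for a cookie variable: returns the value of the
// named cookie, or undefined when the cookie is not present.
function readCookie(cookieString, name) {
  var pairs = cookieString ? cookieString.split('; ') : [];
  for (var i = 0; i < pairs.length; i++) {
    var index = pairs[i].indexOf('=');
    if (index > -1 && pairs[i].substring(0, index) === name) {
      return pairs[i].substring(index + 1);
    }
  }
  return undefined;
}

// In a browser: readCookie(document.cookie, 'internal_traffic')
```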

Step 4: Make sure that the script does not fire all the time

Finally, add the script as a custom HTML tag that fires on all pages as long as the cookies are not set:

Step 5: Write a script that turns the cookies set into one variable

The next thing we need is a script that merges the cookies into one variable. This has an advantage, as it makes it possible to use lookup tables, which will decide whether a user should be anonymized or not:
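A minimal sketch of that merge step, assuming the two cookies are called internal_traffic and external_traffic (hypothetical names). In Tag Manager the logic would live inside the anonymous function of a Custom JavaScript variable, with the two arguments replaced by your cookie variables in double curly braces.

```javascript
// Turns the two cookie values into a single label.
function trafficType(internalCookie, externalCookie) {
  if (internalCookie === 'true') {
    return 'internal';
  }
  if (externalCookie === 'true') {
    return 'external';
  }
  return 'unknown'; // neither cookie is set yet, e.g. before the IP check has finished
}
```

The 'unknown' fallback is my own assumption; you may prefer to treat a missing cookie as external traffic instead.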

Then we create a lookup table that we will use to decide whether the anonymize IP feature should be switched on or off:

For some reason, boolean values need to be wrapped in quotes (‘ ’) to function within your Global Analytics settings.
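Expressed as code, the lookup table is nothing more than a mapping from the traffic label to a quoted boolean. The keys and the default below are assumptions for illustration.

```javascript
// Stand-in for a Lookup Table variable: input is the traffic label,
// output is the value for the anonymize IP field (as text, hence the quotes).
var anonymizeIpLookup = { internal: 'false', external: 'true' };

function anonymizeIpFor(label) {
  if (Object.prototype.hasOwnProperty.call(anonymizeIpLookup, label)) {
    return anonymizeIpLookup[label];
  }
  return 'true'; // assumed default: anonymize whenever the label is unknown
}
```

Defaulting to 'true' is the cautious choice: if the cookie is missing for any reason, the address is anonymized rather than stored in full.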

Step 6: Add the lookup table to your Global Analytics settings

Finally, the anonymize IP feature in Google Analytics is set to only be active when it is external traffic on your site. Furthermore, it also sends whether it is internal or external traffic through a custom dimension.
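For readers who think in analytics.js terms, the settings above boil down to two fields. The sketch below builds them as a plain object; buildGaFields and the dimension index are placeholders, while anonymizeIp and dimensionN are the regular analytics.js field names.

```javascript
// Builds the fields that the settings variable would send along with each hit.
function buildGaFields(label, dimensionIndex) {
  var fields = {};
  // Anonymize everything except traffic labelled as internal.
  fields.anonymizeIp = label !== 'internal';
  // Send the label itself as a custom dimension, e.g. dimension1.
  fields['dimension' + dimensionIndex] = label;
  return fields;
}

// With analytics.js loaded directly, this could be applied with:
//   ga('set', buildGaFields('external', 1));
```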

The reason I have chosen to do this, instead of just anonymizing everything and filtering the traffic based on the custom dimension, is to know which IP addresses are internal. With different agencies and stakeholders involved, I believe it is important to know what we are excluding on the site, and to be able to check which IP a filter belongs to.

Cons

  • Sending your users’ IP addresses to an external service is not always something that your legal department will be okay with
  • It exposes all your internal IP addresses on the webpage

Pros

  • It is quite effective if you don’t have other means to exclude internal traffic

Excluding your internal traffic based on the custom dimension set

If you want to exclude your internal traffic with a custom dimension, you first need to set it up under Custom Definitions –> Custom Dimensions. From here you need to select the index corresponding to the dimension index selected in Google Tag Manager and set its scope to “User”.

Once that is done, you can add an exclusion filter to remove your internal traffic. Remember that this does not work retroactively, meaning that internal traffic will only be excluded from the day you set up the filter, unless the filter was already in place before you started on this post.

And that should be it. Now any IP you add to your list in Google Tag Manager will be filtered out in Google Analytics.

Solution B: The same, but get your developers to do it!

Solution A is made for people who don’t have the resources to have IT detect the type of traffic going to the site. If you do have those resources, I recommend having IT create a variable in the dataLayer that is based on a server-side check of the user’s IP address, and from here you can set your values:
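What the developers build will depend entirely on your stack, so take the following as a sketch only, written as a plain Node.js function. The pattern, the dataLayer key trafficType and the function name are all hypothetical.

```javascript
// Server-side sketch: the IP list never leaves the server.
var INTERNAL_IP_PATTERN = /^(203\.0\.113\.7|198\.51\.100\.(1[0-9]|20))$/;

// Returns the snippet to render in the page, above the Tag Manager container.
function dataLayerSnippet(clientIp) {
  var label = INTERNAL_IP_PATTERN.test(clientIp) ? 'internal' : 'external';
  // Only the label reaches the browser, not the address or the pattern.
  return '<script>window.dataLayer = window.dataLayer || [];' +
    'window.dataLayer.push(' + JSON.stringify({ trafficType: label }) + ');' +
    '</script>';
}
```

In Tag Manager you would then read trafficType with a Data Layer Variable and feed it into the same lookup table as in Solution A.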

Cons

  • Involving IT takes maintenance, time and usually a bigger budget

Pros

  • Not sending your users’ IP addresses to third parties

Solution C: Make people in your organization visit a specific site each time they log in on an internal network

The last solution is something I have heard people say that they have done, and it is also one of the methods that Simo has written about in his previous post. It is, however, not something I have seen done successfully. It simply works by adding a unique identifier for all people who need to be excluded from the site. This can be done by:

  • Having them open up a specific email with a UTM code
  • Making the internal visitors visit a specific page that then sets a cookie
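The second option could be sketched like this. The page path, the cookie name and the 30 day lifetime are all assumptions; the point is just that visiting one specific page marks the browser as internal.

```javascript
var OPT_IN_PATH = '/internal-traffic'; // hypothetical page shared only with colleagues

// Returns the cookie string to set, or null when the visitor is on any other page.
function optInCookieFor(pathname, now) {
  if (pathname !== OPT_IN_PATH) {
    return null;
  }
  var expires = new Date(now.getTime() + 30 * 24 * 60 * 60 * 1000);
  return 'internal_traffic=true; expires=' + expires.toUTCString() + '; path=/';
}

// Browser-only part: set the cookie when the opt-in page is visited.
if (typeof document !== 'undefined') {
  var optInCookie = optInCookieFor(window.location.pathname, new Date());
  if (optInCookie) {
    document.cookie = optInCookie;
  }
}
```

Because the cookie expires, and people switch browsers and devices, everyone has to revisit the page regularly, which is exactly the maintenance issue listed under Cons below.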

Cons

  • Getting people to do things is difficult and requires frequent maintenance

Pros

  • If it works, it works well and without having multiple stakeholders involved

Final thoughts

I believe it is important to know why you need or want to exclude your IP addresses, and even more important to have a general understanding of how your organization uses data and works with user behavior in Google Analytics. In terms of anonymizing IP addresses, the first solution presented is a good starting point if you need to take action and don’t have any resources yourself. I do think that the best approach is to let the developers look at the IP address server side and push that information to you; however, it can be a hassle to get it updated.

I have been discussing my first solution and the issue with different clients and nerdy peers. My conclusion so far is that it really depends on what organization you are in, and where you and your legal team stand.

If you feel that any of the solutions I have provided can be used, feel free to implement them. If not, please leave a comment and let me know if you have a better solution to the issue!

Lastly, a big thanks to Thomas Rhode for helping me write the code.

Tracking Pardot iFrame Forms

If there is one thing that can make everyone working with websites hate their lives, it is iFrames. Recently I have worked a lot with Pardot Forms, which are embedded in an iFrame (in most cases, anyway). To track them, I looked at Ryan Praski’s amazing guide to tracking Pardot iFrame Forms; however, it wasn’t dynamic enough for us to use at IMPACT EXTEND. I was fortunate enough to sit with the CEO of the company, Thomas Rhode, who also knows his share of JavaScript. Together we created a two-part solution:

  1. One script to be embedded in the iFrame
  2. Another script that listens for the iFrame form to be submitted

Script one – Embedded in the form

In the thank you code section of the Pardot iFrame Form, use the following script:
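A minimal sketch of what such a script could look like, assuming the message is passed to the parent page with window.postMessage. The event name, the form name and the domain are placeholders that you would replace with your own.

```javascript
// Builds the message the parent page will receive. Names are hypothetical.
function buildFormMessage(formName) {
  return { event: 'pardotFormSubmit', formName: formName };
}

// Runs inside the iFrame once the thank you content is shown.
if (typeof window !== 'undefined' && window.parent) {
  window.parent.postMessage(
    buildFormMessage('Newsletter signup'),
    'https://www.example.com' // origin of the page that embeds the form
  );
}
```

Passing the parent's origin as the second argument, rather than '*', means the message is only delivered if the form really is embedded on your own site.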

Script two – Custom HTML on all pages

Add this to Google Tag Manager:
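The listening side could be sketched as follows, again assuming postMessage is used between the iFrame and the page. The origin and the event name are placeholders and have to match whatever the script in the form sends.

```javascript
var PARDOT_ORIGIN = 'https://go.example.com'; // hypothetical domain serving the form

// Pushes the form submission to the dataLayer; returns true when it did.
function handleFormMessage(messageEvent, dataLayer) {
  if (messageEvent.origin !== PARDOT_ORIGIN) {
    return false; // ignore messages from any other frame or site
  }
  var data = messageEvent.data;
  if (!data || data.event !== 'pardotFormSubmit') {
    return false;
  }
  dataLayer.push({ event: data.event, formName: data.formName });
  return true;
}

// Browser-only part: listen for messages coming from the iFrame.
if (typeof window !== 'undefined') {
  window.dataLayer = window.dataLayer || [];
  window.addEventListener('message', function (messageEvent) {
    handleFormMessage(messageEvent, window.dataLayer);
  });
}
```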

All you have to do now

If you have added the custom HTML tag to all pages, and added the code to your form (and remembered to change the names and domains in the first script to your own), then every time a form is submitted, you will be able to see the variables in the dataLayer. With this, you can now track it in Google Analytics, send it as a Facebook conversion, or anything else that you like.

That’s it for now. Let me know what you think =)

 

Setting up smart triggers with lookup tables in Google Tag Manager

I often need to manage a series of tags where I have to handle multiple business units in multiple languages, and where multiple events need to occur to fire specific tags.

When you are using lookup tables, you are often limited, as each table only takes one variable as input and gives one output. However, since the release of the Regex Lookup table, a lot of things have become easier to do.

Everything is an event

Whenever something happens within Google Tag Manager, an event is fired. A DOM load is a gtm.dom event, a Page load is a gtm.load event and so on. In this post I will write about how to use this to make your tracking a bit smarter and your triggers more dynamic.

In my last post I showed how to strip down Floodlight Tag parameters. As I hate making a ton of tags, I thought: “What if I could combine all my Floodlight tags into 2 tags”, a Counter and a Sales tag, and only have to use the 3 variables necessary to build them (Category, Source and Type).

To do this I decided to make a very small piece of JavaScript to handle the task:


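function() {
  // Custom JavaScript variable: joins the built-in Event and Page Path
  // variables into one string that a lookup table can match against
  var combinedVariables = {{Event}} + {{Page Path}};
  return combinedVariables;
}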

This is just an example, but it has endless possibilities. Imagine that you want a tag to trigger once some specific dataLayer variables are present on certain pages. Now you can! Just go and add them to your custom variable and select the things you need to be able to fire your tags. See how I set it up here:

Above I have combined the business unit with a country, an event and a pageview. This means that I can switch between any organisation built into the dataLayer and do any type of combination I need. This is quite neat, as it gives me the flexibility to use 3 variables to control 2 Floodlight tags instead of 20, saving me time and giving others a better overview when having to use multiple marketing tags.
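To show the idea outside of the Tag Manager interface, here is a small sketch of a combined key being matched against a regex table. The business unit, country codes, paths and output categories are all invented for the example, and the separator is my own addition to keep the parts apart.

```javascript
// Joins the individual values into one key, e.g. "retail|dk|gtm.js|/checkout/thanks".
function combineKey(businessUnit, country, eventName, pagePath) {
  return [businessUnit, country, eventName, pagePath].join('|');
}

// Stand-in for a regex table: the first matching pattern wins.
var floodlightCategoryTable = [
  { pattern: /^retail\|dk\|gtm\.js\|\/checkout\/thanks/, output: 'dk_sale' },
  { pattern: /^retail\|[a-z]{2}\|gtm\.js\|\/$/, output: 'frontpage' }
];

function lookupCategory(key) {
  for (var i = 0; i < floodlightCategoryTable.length; i++) {
    if (floodlightCategoryTable[i].pattern.test(key)) {
      return floodlightCategoryTable[i].output;
    }
  }
  return undefined; // no row matched, so no Floodlight category
}
```

One row per combination replaces one tag per combination, which is where the reduction from 20 tags to 2 comes from.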

Dealing with multiple Floodlight Tags

I hate setting up Floodlight Tags

And because of that, I decided to try to strip down the Floodlight Tag parameters using R. This tutorial is meant to help others in the same pain and suffering as me, and hopefully reduce the time spent on implementing, leaving more time for analyzing and finding insights instead.

What normally happens

A client’s marketing team usually wants Floodlight tags to be added to the site when an event is triggered or when a page is loaded. While this is completely fair, it can be a hassle to set up without cluttering your Google Tag Manager setup, which in many cases starts to look like this:

What is even worse is how the tags are delivered to you

Whenever I am presented with all the Floodlight tags marketers want on the site, it is a lot of Excel rows from which I manually have to extract 3 values:

  • Source
  • Type
  • Category

This ends up looking like this:

Well, I can just copy the values from there?

Of course you can. However, when someone sends you 100 rows of Floodlight tags, it starts getting less fun.

R to the rescue

You could do the extraction in Excel; however, this causes two issues:

  • You need to remember and store the formula you are using
  • You need to re-do the process each time you have to do the task

Adding a script to fix it each time

By using R you ensure that your data is formatted the exact same way each time, and it basically becomes a click-and-done task.

The script below is my take on how to extract the values from the script:
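The extraction itself is not tied to R. As a language-neutral illustration of the parsing step, here is the same idea sketched in JavaScript, to match the other examples on this page. It assumes each row contains a Floodlight tag in the usual format, where src, type and cat appear as semicolon-separated parameters in the URL; the function name and the sample values are made up.

```javascript
// Pulls Source, Type and Category out of one Floodlight tag snippet.
function parseFloodlightTag(tagSnippet) {
  function param(name) {
    // The leading semicolon avoids matching the src="..." HTML attribute.
    var match = new RegExp(';' + name + '=([^;?"\']+)').exec(tagSnippet);
    return match ? match[1] : null;
  }
  return {
    source: param('src'),
    type: param('type'),
    category: param('cat')
  };
}

// Example with a made-up tag:
//   parseFloodlightTag('<iframe src="https://1234567.fls.doubleclick.net/activityi;src=1234567;type=sales0;cat=thank0;ord=1?"></iframe>')
```

Running this over every row gives one object per tag with the three values, ready to be written to a sheet or a CSV file.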

What does the output give in terms of floodlight tags?

Running the script gives you a data.frame with all the values extracted. From here you can either add it to Google Sheets, or write it to a CSV file as I have done in the example.

This is it

In this post I have chosen to showcase how to extract data from an Excel file with R and convert it into a more useful format. It is something that has saved me a lot of time, and which I have used in multiple instances.

What’s next?

To see how you can make Floodlight tags work even smarter for you in Google Tag Manager, see the next post about making smart triggers with lookup tables.

Tracking the basics in Google Tag Manager

Introduction

I have recently had to set event tracking on a lot of Google Tag Manager containers for various clients. Through that, I have had some time to think of standard events I usually end up setting up. These are:

  • E-mail clicks
  • Phone clicks
  • Social Link Clicks
  • Outbound links
  • Downloads

I therefore did what everybody else would do. No, I am not talking about coding it… I made a Google search to find a script that contains everything. Unfortunately, I could not find one, so I decided to team up with an ex-colleague, Markus Kelle, who had already built something similar. With a few adjustments and add-ins, the script was in place and ready to share:

The script

Setting up the tag – The howto guide

1. Creating event variables

The event variables are used to capture the data the script pushes to the dataLayer.

2. Setting up the event

The next step in the process is to create the event tag itself. This enables us to push the data into Google Analytics.

3. Adding the script to the site

The script is built on the following logic: if a specific predefined link is clicked, then push an event that our event tag can use to send the information to Google Analytics.
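That logic can be sketched in a few lines. This is not the shared script itself, just a minimal illustration of the approach: the list of social domains, the file extensions, the event name basicLinkClick and the dataLayer keys are all assumptions.

```javascript
var SOCIAL_HOSTS = /(^|\.)(facebook|twitter|linkedin|instagram|youtube)\.com$/i;
var DOWNLOAD_EXTENSIONS = /\.(pdf|docx?|xlsx?|pptx?|zip|csv)$/i;

// Returns 'email', 'phone', 'download', 'social', 'outbound' or null.
function classifyLink(href, siteHostname) {
  if (/^mailto:/i.test(href)) { return 'email'; }
  if (/^tel:/i.test(href)) { return 'phone'; }
  var withoutQuery = href.split(/[?#]/)[0];
  if (DOWNLOAD_EXTENSIONS.test(withoutQuery)) { return 'download'; }
  var match = /^https?:\/\/([^\/:]+)/i.exec(withoutQuery);
  if (!match) { return null; } // relative link: an ordinary internal page
  var hostname = match[1].toLowerCase();
  if (SOCIAL_HOSTS.test(hostname)) { return 'social'; }
  return hostname === siteHostname ? null : 'outbound';
}

// Browser-only part: push an event whenever a predefined link type is clicked.
if (typeof document !== 'undefined') {
  window.dataLayer = window.dataLayer || [];
  document.addEventListener('click', function (clickEvent) {
    var target = clickEvent.target;
    var link = target && target.closest ? target.closest('a[href]') : null;
    if (!link) { return; }
    var linkType = classifyLink(link.getAttribute('href'), window.location.hostname);
    if (linkType) {
      window.dataLayer.push({ event: 'basicLinkClick', linkType: linkType, linkUrl: link.href });
    }
  });
}
```

The order of the checks matters: downloads are tested before the hostname so that a PDF hosted on another domain is still counted as a download rather than an outbound click.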

4. Testing if it works

The last step is triggering events on the site to see whether they fire. In this case I have used GTM’s own debugger to make it easy to verify.

Final Words

This script is by no means perfectly coded, or intended to be the one and only solution to basic link tracking on your website. It is just one out of many approaches, and one that works for me. Feel free to use it, or comment if you feel that you have a better solution. Meanwhile, I hope you enjoyed this small post and that it is of use to you.