BDEX Guide: How to Build an Identity Graph for Your Identity Resolution Strategy

Identity resolution in programmatic advertising remains a major challenge for brands. With changes to digital data privacy at the legislative level and the cookie-less future impeding brands’ ability to leverage third-party data, organizations will be forced to find new ways of accessing and targeting their audiences.

Building an identity graph to conduct identity resolution remains the best way to understand your audience at scale. But in order to understand identity graphs, let’s start by discussing what we mean by ‘identities.’


In programmatic advertising, an identity is any digital indicator or token that is unique to you and helps categorize you based on your interests. The device on which you access a website, your device’s ID, browser cookies, where you live, and other data can all be used to improve identity graphs. In addition to buying this type of data from vendors, brands also solicit data directly from their customers through a variety of means, including digital channels like apps or websites. By having users log in and browse throughout the day, brands can collect a robust dataset about each individual.

The goal is to take all of those pieces of data and connect them so the brand can derive a unified view of its customer. This unified view compiles the customer’s past interactions, interests, and purchases so you can better understand what content will land with them. The challenge is that the sheer number of data points makes aggregating and synthesizing that data a massive undertaking. Additionally, because the data types differ, aligning them can be like comparing apples to oranges. To fully understand the customer through identity resolution, the data must be evaluated, structured, analyzed, and acted upon in a uniform manner.
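As a rough illustration of what "uniform" can mean in practice, the sketch below maps raw identifiers from different sources onto one canonical form before any matching happens. The function name, the three identifier types, and the specific cleaning rules are assumptions made for this example, not a prescribed schema.

```python
import hashlib


def normalize_identifier(id_type, raw_value):
    """Return a (type, canonical_value) pair so the same identifier
    compares equal no matter which source it came from.

    The identifier types and rules here are illustrative assumptions.
    """
    value = raw_value.strip()
    if id_type == "email":
        # Lowercase, then hash, so the graph never stores the raw address.
        value = hashlib.sha256(value.lower().encode("utf-8")).hexdigest()
    elif id_type == "device_id":
        # Assume some sources wrap device IDs in braces or vary the case.
        value = value.strip("{}").lower()
    elif id_type == "cookie":
        # Cookie values are treated as opaque and kept as-is.
        pass
    else:
        raise ValueError(f"unsupported identifier type: {id_type}")
    return (id_type, value)
```

With a step like this in front of the graph, " Jane@Example.com " collected by an app and "jane@example.com" collected by a website resolve to the same node instead of two.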

Here are three tips for how to effectively structure identity data for identity resolution at scale:

Apply Machine Learning

Learning more about customers is the most important piece of the identity resolution puzzle, so applying machine learning to your datasets is a critical step to draw linkages. Machine learning levels up your identity-based marketing because it can help detect patterns in your data, suggesting the same customer is using multiple devices.

While machine learning is typically used to draw links between disparate third-party data, one of the most important uses of first-party data moving forward will be helping machine learning draw those linkages more accurately. It starts with applying machine learning to rich first-party data, which allows brands to understand their customers’ buying habits and where they can find more customers like them. First-party data with known linkages can show algorithms what makes a linkage likely, helping them draw better links between data points and giving brands a better understanding of who their customers are.
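To make the idea of learning from known linkages concrete, here is a minimal sketch in the style of classic probabilistic record linkage. It assumes each labeled example is a pair of identifiers described by yes/no agreement features (the feature names, such as "same_ip", are hypothetical) plus a label taken from first-party logins saying whether the pair really belongs to one customer. A production system would likely use a richer model; this only shows the principle.

```python
import math


def learn_weights(labeled_pairs, feature_names):
    """Estimate, per feature, how strongly agreement or disagreement
    shifts the log-odds that two identifiers belong to one customer.

    labeled_pairs: list of (features_dict, is_same_customer) tuples,
    where the labels come from known first-party linkages.
    """
    matches = [f for f, same in labeled_pairs if same]
    non_matches = [f for f, same in labeled_pairs if not same]
    weights = {}
    for name in feature_names:
        # Add-one smoothing keeps both ratios finite when a count is zero.
        m = (sum(bool(f[name]) for f in matches) + 1) / (len(matches) + 2)
        u = (sum(bool(f[name]) for f in non_matches) + 1) / (len(non_matches) + 2)
        weights[name] = {
            "agree": math.log(m / u),
            "disagree": math.log((1 - m) / (1 - u)),
        }
    return weights


def link_score(features, weights):
    """Sum the learned weights; a higher score means a more likely link."""
    return sum(
        weights[name]["agree" if features[name] else "disagree"]
        for name in weights
    )
```

Features that agree far more often among known matches than among known non-matches receive large positive weights, which is how the known first-party linkages teach the scorer what a likely link looks like.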

Use a Deterministic-Data-First Approach

Conducting identity resolution becomes particularly difficult when the data programmatic advertisers use is probabilistic rather than deterministic. Deterministic identity resolution refers to pairing identifying information, like a name or email address, to multiple devices. Conversely, probabilistic identity resolution involves making inferences and using algorithms that predict the links between devices.

Deterministic identity matching provides direct links between data points, meaning that you know two data points are associated, whereas probabilistic matching makes educated guesses at these linkages. Deterministic identity resolution keeps the view of the customer accurate across channels and is considered the most precise approach to identity-based marketing.

While deterministic identity resolution is much more accurate, an advantage of the probabilistic approach is that it can link much more customer data: because it connects devices and identifiers algorithmically, it creates more links, each with a certain level of confidence. Instead of analyzing only one-to-one matches, as deterministic matching does, probabilistic matching draws on a wider variety of data, which may not always be as accurate. Problems can arise if those linkages are incorrect.

While both types of identity resolution have pros and cons, programmatic advertisers can know they are targeting the right people if they start with a deterministic-data-first approach.
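One way to picture a deterministic-data-first approach is the sketch below. It builds the graph from deterministic links first, then admits probabilistic links only when their confidence clears a threshold. The class and function names, the "type:value" node labels, and the 0.9 default threshold are all assumptions chosen for illustration.

```python
class IdentityGraph:
    """Groups identifiers into customers using a union-find structure."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Unknown identifiers start out as their own single-node group.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            # Path halving keeps later lookups fast.
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def link(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_customer(self, a, b):
        return self._find(a) == self._find(b)


def build_graph(deterministic_links, probabilistic_links, min_confidence=0.9):
    """Apply known links first, then only high-confidence inferred links.

    deterministic_links: iterable of (id_a, id_b)
    probabilistic_links: iterable of (id_a, id_b, confidence in [0, 1])
    """
    graph = IdentityGraph()
    for a, b in deterministic_links:
        graph.link(a, b)
    for a, b, confidence in probabilistic_links:
        if confidence >= min_confidence:  # threshold is an assumed default
            graph.link(a, b)
    return graph
```

For example, if a hashed email is deterministically tied to two devices, and one of those devices is linked to a cookie with confidence 0.95, all four identifiers end up in one customer group, while a 0.4-confidence guess is left out. Raising or lowering `min_confidence` trades reach against the risk of incorrect linkages.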

Find New Ways to Produce First-Party Data

With third-party cookies disappearing, companies are finding it harder to purchase third-party data on the audiences that matter to them and are being forced to find new ways to collect first-party data. Historically, third-party data has often been unreliable, and it requires a lot of guessing even when those guesses are made by algorithms. First-party data includes deterministic data, but it also includes other valuable information like behavioral data and purchase history. In a study by Lytics, 92% of brands surveyed considered first-party data “more important than ever” amid the deprecation of third-party cookies.

Relying on third-party data is dangerous, so one thing programmatic advertisers can do is go right to the source and get first-party data from their customers. This could include encouraging web visitors to create an account or download an app that allows the business to see more granularity on their activities. We continue to see more brands using promotions or personalized products to acquire customer information and amass their first-party arsenal. Doing so allows them to build out their customer profiles to better personalize their offering to each customer’s needs, increasing their marketing ROI.

What can advertisers do?

Advertisers can also elect to work with a vendor that has access to their ideal customer segments and leverage the appropriate data to reach them. As third-party data falls out of favor, we could also see large companies with sizable datasets start licensing data to brands as a form of passive income.

While likely a painful process at first, reorienting to a first-party data approach can help make identity graphs cleaner and much more accurate, simply because they are based on information you know to be factual rather than information that might be. The challenge, however, is also being able to identify when fraudulent data from sources like bots and click farms is affecting the quality of your identity graph. Bot traffic can easily trigger a string of events that breaks your identity graph, making it look like one consumer is coming in from multiple devices or like you have more users or visitors than you actually do. If you are looking to build and manage your own identity graph, it is important to have a reliable partner that can provide technology to remove identities linked to bots and other forms of fraudulent data.
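As a simple illustration of the bot problem described above, the sketch below flags any identity that appears on an implausible number of devices and drops its events before they reach the graph. The function name and the limit of five devices are assumptions for this example. Real fraud detection relies on many more signals than a single count.

```python
from collections import defaultdict


def remove_suspect_identities(events, max_devices=5):
    """Drop events for identities seen on more devices than max_devices.

    events: iterable of (identity_id, device_id) pairs.
    Returns (clean_events, flagged_identity_ids).
    The max_devices default is an illustrative assumption.
    """
    events = list(events)  # the events are scanned twice below
    devices_by_identity = defaultdict(set)
    for identity_id, device_id in events:
        devices_by_identity[identity_id].add(device_id)

    flagged = {
        identity_id
        for identity_id, devices in devices_by_identity.items()
        if len(devices) > max_devices
    }
    clean = [(i, d) for i, d in events if i not in flagged]
    return clean, flagged
```

Flagged identities are returned alongside the cleaned events so they can be reviewed rather than silently discarded.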

For any customer-centric business that aims to personalize customer experiences and create meaningful interactions with its brand, it’s critical to understand the entire persona of the customer based on all of their digital interactions. Organizations cannot afford to work from bits and pieces of information, which is why collecting the identities of a customer that may exist across several platforms and stitching them together into a unified view is so important.
