Why First-Party Data is the Antidote to the Internet of Bots

Programmatic advertisers today operate in an internet environment that is both massive in scale and highly volatile, shaped by shifting regulation and technological disruption.

When advertisers deploy campaigns today, they face a host of challenges in reaching audiences that can deliver return on ad spend (ROAS). Platforms that advertisers once relied on for ROAS are now increasingly full of bots and audiences that provide little to no value to the advertisers paying to reach them. According to some estimates, bots and fake online users make up as much as 40% of total internet traffic.
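To put that estimate in budget terms, here is a minimal arithmetic sketch. It assumes, purely for illustration, that ad spend is spread evenly across traffic, so the share of spend reaching fake users equals the share of fake traffic. In practice the two can differ. The budget and revenue figures, and the function names, are hypothetical.

```python
def wasted_spend(budget: float, invalid_share: float) -> float:
    """Spend that lands on bots or fake users, assuming spend is spread evenly."""
    if not 0.0 <= invalid_share <= 1.0:
        raise ValueError("invalid_share must be between 0 and 1")
    return budget * invalid_share


def roas(revenue: float, spend: float) -> float:
    """Return on ad spend: revenue attributed to ads divided by ad spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue / spend


# Hypothetical campaign figures, chosen only for illustration.
BUDGET = 100_000.0
REVENUE = 300_000.0
# Assumption: the upper-end 40% traffic estimate applies directly to ad spend.
INVALID_SHARE = 0.40

wasted = wasted_spend(BUDGET, INVALID_SHARE)
reported = roas(REVENUE, BUDGET)             # what the dashboard shows
human_only = roas(REVENUE, BUDGET - wasted)  # ROAS on spend that reached people

print(f"Spend reaching fake users: {wasted:,.0f}")
print(f"Reported ROAS: {reported:.2f}")
print(f"ROAS on human-reaching spend: {human_only:.2f}")
```

Under these assumptions, 40,000 of a 100,000 budget buys no real attention, and the campaign's reported ROAS of 3.0 understates the 5.0 return earned on the spend that actually reached people. The gap is the headroom an advertiser could recover with cleaner targeting.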

More and more often, the result is that advertisers burn through large ad budgets targeting audiences that are fake or simply not interested. But why are advertisers and platforms alike allowing this to happen? And why is it happening now more than ever before? This is what we’ll explore below.

The ‘kafkaesque’ state of digital advertising

In a nutshell, fake online users are proliferating on the internet because there is money in bots. A lot of money. Moreover, digital advertising platforms are largely responsible for what is now an internet dominated by automated users. WIRED labeled this new reality of bots dominating the internet ‘kafkaesque.’ However, when we look at the individual platforms enabling this proliferation, we begin to identify the financial incentives.

Social media platforms like Twitter have been key to incentivizing bots across the internet. In September 2022, in testimony before Congress, Twitter whistleblower Peiter “Mudge” Zatko said publicly that Twitter executives’ bonuses were tied to increases in daily users. That underscores the point that Twitter executives not only had an incentive to look the other way as bot accounts grew, but also stood to make money when new fake user accounts were added to the platform.

Additionally, following Elon Musk’s highly publicized takeover of Twitter, Musk polled Twitter users on whether he should step down, and users voted 57% in favor of that move. Afterwards, Musk quickly implemented a policy restricting polls exclusively to Twitter Blue subscribers to reduce noise from bots on the platform, directly contradicting his previous claim that all the Twitter bots were gone.

What seems clear is that digital platforms profit from bots and will use them to the financial advantage of their advertising businesses. But why do companies continue to deploy advertising campaigns on digital platforms if they are aware that bots are everywhere?

Why businesses are playing along

There are a few pieces to understanding why advertisers continue to spend huge sums on advertising campaigns that end up being served to fake accounts. The first is that advertisers don’t have many other options. Sure, they can decide not to deploy on a given platform, but with fake accounts and bots across every major digital platform, advertisers have to choose between not advertising at all and potentially paying to target some bots.

The second piece is that audience data is really easy to fake. Third-party data – or data bought or obtained from a third-party source – is everywhere, but unfortunately that also means it’s hard to separate good data from bad (or even intentionally misleading) data. This has driven a data-buying ecosystem where companies can create fake data at will and sell the same data to customers over and over. A typical third-party data list might contain emails that are 10 years old or more, and it almost certainly contains a number of fake users or other data that is completely irrelevant to the campaign it’s sold for.

The good news is that advertisers are beginning to catch on to these manipulative tricks and are pushing back on platforms to make sure they stop wasting dollars and start improving their ROAS. One of the best ways for programmatic advertisers to start improving their ROAS is with first-party data.

Is first-party data really any better?

The short answer is a resounding ‘yes.’ First-party data – or data that is collected and owned by a company for its own purposes – is significantly better for targeting potential customers than third-party data. First-party data gives brands quality insights into their audience from people they know are paying customers. This is also referred to as ‘deterministic’ data, as opposed to probabilistic third-party data, which makes educated guesses about which audiences are relevant. Deterministic, first-party data is far better suited to microtargeting campaigns.

Many companies want to use first-party data but do not know where or how to start. For these companies, it’s important to understand that you can start anywhere with collecting first-party data. One of the best ways to start small and gain actionable first-party data is to ask your customers one question, either at the point of purchase or afterwards. Picking just one question can separate your brand from the majority of branded surveys that ask for a short phone call or “5 minutes of your time,” because in reality, most consumers do not want to volunteer their time to a company they have already paid for a product or service. So instead, identify the one thing you’d really like to know about your customer and make it the only question you ask.

Many advertisers also assume that ad targeting with first-party data means you’re only targeting existing customers. What they fail to realize is that rich first-party data can be used to build custom audiences with machine learning. By analyzing and comparing their customer data against other deterministic, first-party datasets, advertisers can generate new deterministic audience lists of people who share characteristics with their existing customer base but are not yet customers. If the same process were applied to a list of third-party data, it would still identify similar audiences, but those audiences would share the same fake user characteristics that plague third-party data today.
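The audience-building idea above can be sketched in a few lines. This is a deliberately simple stand-in for whatever model a production system would use: it averages existing customers into a single profile and ranks non-customers by cosine similarity to that profile. The feature columns, record IDs, and function names are all hypothetical, and the sketch assumes every record has already been turned into a numeric vector on a comparable scale.

```python
import math


def centroid(rows: list[list[float]]) -> list[float]:
    """Average feature vector of the existing customer base."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookalikes(customers: dict[str, list[float]],
               prospects: dict[str, list[float]],
               top_k: int) -> list[tuple[str, float]]:
    """Rank prospects who are not already customers by similarity to the customer profile."""
    profile = centroid(list(customers.values()))
    scored = [
        (pid, cosine(vec, profile))
        for pid, vec in prospects.items()
        if pid not in customers  # only people who are not yet customers
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical features: [repeat-purchase rate, discount sensitivity, email engagement]
customers = {"c1": [0.9, 0.1, 0.8], "c2": [0.8, 0.2, 0.9]}
prospects = {
    "p1": [0.85, 0.15, 0.8],  # resembles the customer base
    "p2": [0.1, 0.9, 0.1],    # does not
    "c1": [0.9, 0.1, 0.8],    # already a customer, so excluded
}

for prospect_id, score in lookalikes(customers, prospects, top_k=2):
    print(prospect_id, round(score, 3))
```

The quality of the output depends entirely on the quality of both inputs. If the prospect pool came from an unverified third-party list, the same ranking would surface fake records just as readily as real people.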

Advertisers and the platforms they fund have long used, and been satisfied with, poor-quality data, but luckily that is starting to change. Advertisers now better understand the ripple effects of bad data and what it means for their return on ad spend, and this is driving closer scrutiny of where advertising data comes from. As programmatic advertisers continue to evaluate their long-term advertising goals, first-party data will no doubt be central to those strategies.