Summary:
Identity resolution is crucial for creating accurate customer profiles by connecting fragmented digital data, such as emails and device IDs, into a unified record. This process enhances marketing efficiency by improving targeting, personalization, and attribution. However, the effectiveness of identity resolution depends on the methods used—deterministic for accuracy and probabilistic for reach—and the reliability of the data. As privacy regulations tighten and third-party cookies phase out, leveraging first-party data through robust identity resolution becomes essential. Companies must ensure their identity systems are accurate and privacy-compliant to optimize marketing efforts and build durable customer relationships.
A few years back, while building identity resolution systems, we created what we called a Canary File — about 2,000 verified records of people we personally knew. Friends, family, colleagues. People whose correct name, email, and address could be confirmed with absolute certainty.
We ran that file through some of the biggest, most respected data companies in the industry. The gold standard players. The ones with the slickest pitch decks and the most confident match-rate claims.
The best of them came back at 30 to 40 percent accurate.
Thirty to forty percent.
On verified data. Against the best providers available.
That result isn't just surprising — it's a window into why so many digital campaigns underperform in ways that are genuinely hard to diagnose. The targeting looks right. The audiences look qualified. But somewhere underneath the dashboard, the data is wrong, and you're paying full price to reach people who either don't exist or aren't who you think they are.
That's the problem identity resolution is supposed to solve. And when it's done well, it does.
What Identity Resolution Actually Does
The average person leaves a fragmented digital trail. A work email and a personal email, maybe an old one they still check occasionally. Desktop browsing at the office, mobile browsing everywhere else. Sites they log into and sites they visit anonymously. An address from two moves ago still attached to a loyalty account somewhere.
Identity resolution is the discipline of connecting those fragments — email addresses, device IDs, cookies, phone numbers, postal addresses, behavioral signals — and determining whether they belong to the same real person. The output is a unified customer profile: a single, coherent record that travels across channels and campaigns instead of living in disconnected silos.
When that works, every downstream marketing system gets smarter. Targeting improves because you know who you're actually talking to. Personalization becomes possible because you have a real picture of the customer, not a patchwork of guesses. Attribution gets more accurate because you can trace a conversion back across multiple touchpoints instead of crediting whichever one happened last.
When it doesn't work — when the underlying match data is stale, inaccurate, or built on probabilistic guesses dressed up as confirmed identities — you get expensive noise that looks like signal on a dashboard.
Two Approaches: Deterministic and Probabilistic
Every identity resolution system operates on one of two methods, or a combination of both.
Deterministic matching works from exact data points. You have someone's email address and the database has that exact email address: confirmed match. You have a phone number and they have the same phone number: confirmed match. It's highly accurate when it works, with accuracy typically in the 70–80% range for solid implementations. The limitation is coverage: if a data point is misspelled, outdated, or simply missing, deterministic matching has nothing to anchor to.
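As a minimal sketch of the exact-match idea, assuming records are plain dictionaries with hypothetical `email` and `phone` fields, the Python below normalizes each identifier and calls two records a match only when a normalized value is identical. The function names and normalization rules are illustrative assumptions, not any vendor's method.

```python
import re


def normalize_email(email):
    """Lowercase and trim so trivial formatting differences don't block a match."""
    return email.strip().lower()


def normalize_phone(phone):
    """Keep digits only, so '(619) 555-0100' and '619.555.0100' compare equal."""
    return re.sub(r"\D", "", phone)


def deterministic_match(a, b):
    """Return True only if both records share an identical normalized email or phone."""
    if a.get("email") and b.get("email"):
        if normalize_email(a["email"]) == normalize_email(b["email"]):
            return True
    if a.get("phone") and b.get("phone"):
        if normalize_phone(a["phone"]) == normalize_phone(b["phone"]):
            return True
    return False


crm = {"email": "Pat.Lee@example.com ", "phone": "(619) 555-0100"}
web = {"email": "pat.lee@example.com", "phone": None}
typo = {"email": "pat.le@example.com", "phone": None}

print(deterministic_match(crm, web))   # True
print(deterministic_match(crm, typo))  # False
```

The second comparison is the coverage limit in miniature: one dropped letter in the email and there is nothing left to anchor the match.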
Probabilistic matching operates on likelihood rather than certainty. It analyzes patterns — IP addresses, device types, operating systems, location signals, behavioral indicators — and calculates the probability that two data points belong to the same person. It extends reach significantly beyond what deterministic matching alone can achieve, but introduces uncertainty by design. Look-alike audiences on Facebook and Google are probabilistic. Your own CRM records are deterministic. Both have a role. Neither is the full story on its own.
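To make "likelihood rather than certainty" concrete, here is a toy scoring sketch in Python. The signal names, the point values, and the 0.75 threshold are all assumptions chosen for illustration; production systems typically learn these from data rather than hard-coding them.

```python
# Hypothetical points: how much each agreeing signal counts toward a match.
SIGNAL_POINTS = {
    "ip_address": 40,
    "device_type": 10,
    "operating_system": 10,
    "city": 20,
    "active_hours": 20,
}


def match_score(a, b, points=SIGNAL_POINTS):
    """Return a 0-to-1 score: points for agreeing signals over all possible points."""
    agreed = 0
    for signal, value in points.items():
        # A signal missing from either session counts as no evidence.
        if signal in a and signal in b and a[signal] == b[signal]:
            agreed += value
    return agreed / sum(points.values())


def is_probable_match(a, b, threshold=0.75):
    """Lower thresholds extend reach; higher thresholds cut false matches."""
    return match_score(a, b) >= threshold


desktop = {"ip_address": "203.0.113.7", "device_type": "desktop",
           "operating_system": "Windows", "city": "San Diego",
           "active_hours": "evening"}
phone = {"ip_address": "203.0.113.7", "device_type": "mobile",
         "operating_system": "iOS", "city": "San Diego",
         "active_hours": "evening"}

print(match_score(desktop, phone))                       # 0.8
print(is_probable_match(desktop, phone))                 # True
print(is_probable_match(desktop, phone, threshold=0.9))  # False
```

The same pair of sessions is a match at one threshold and not at another, and nothing in the score distinguishes one person from two people sharing a home network. That trade between reach and certainty is the whole character of the method.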
The practical flag to watch: a vendor claiming extremely high match rates through primarily probabilistic methods deserves a hard question or two. High reach with low verification is a very efficient way to build a large audience of the wrong people.
Why This Matters More Right Now
The old infrastructure of digital advertising was built on third-party cookies — files that let advertisers track users across websites and build behavioral profiles without any direct relationship with those users. It was imprecise and privacy-invasive, but it was ubiquitous enough that most of the industry just kept running on it.
That infrastructure is no longer dependable. Safari and Firefox already block or partition third-party cookies by default, and although Chrome has walked back its long-announced plan to remove them, they are no longer something a campaign can count on everywhere. Apple's App Tracking Transparency framework put a significant dent in mobile targeting. Privacy regulations (GDPR in Europe, CCPA in California, and a growing patchwork of state and national laws) are steadily tightening what data can be collected, stored, and used without explicit consent.
What fills the gap? First-party data and identity resolution.
First-party data is the information people give you directly — through your website, your email list, your app, your CRM. It's yours, it's consented, and it's increasingly the only targeting asset that holds up under regulatory scrutiny and platform policy changes simultaneously.
Identity resolution is what makes that first-party data powerful — by connecting the dots across your own customer records into unified profiles that can drive real personalization. The companies that get this right aren't just staying compliant. They're building a durable asset that compounds over time: every customer interaction adds to the identity graph, every enriched profile improves targeting, every accurate match reduces wasted spend.
The Architecture: Identity Graphs
The infrastructure underneath identity resolution is called an identity graph — a database that maps all known identifiers for an individual (emails, phone numbers, device IDs, mobile advertising IDs, postal addresses) and links them to a single unified record.
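One simple way to picture that mapping is a union-find structure: every identifier is a node, and each observed link merges two nodes into the same profile. The sketch below is an illustrative assumption about how such a graph could be modeled, with a hypothetical `IdentityGraph` class and made-up identifiers, not a description of any particular product.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph: observed links merge identifiers into one profile."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        """Return the representative identifier of the profile this one belongs to."""
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, a, b):
        """Record that two identifiers were seen together, e.g. a login on a device."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)

    def profiles(self):
        """Group every known identifier under its unified profile."""
        groups = defaultdict(set)
        for identifier in list(self._parent):
            groups[self._find(identifier)].add(identifier)
        return list(groups.values())


graph = IdentityGraph()
graph.link("email:pat@example.com", "device:desktop-123")
graph.link("email:pat@example.com", "phone:+16195550100")
graph.link("device:mobile-456", "phone:+16195550100")
graph.link("email:sam@example.com", "device:tablet-789")

print(graph.same_person("device:desktop-123", "device:mobile-456"))  # True
print(graph.same_person("device:desktop-123", "device:tablet-789"))  # False
print(len(graph.profiles()))                                         # 2
```

The desktop and the mobile device were never observed together, yet they resolve to one profile through the shared email and phone number. The same property cuts the other way: a single wrong link merges two people into one record.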
Building a reliable one is genuinely difficult, for a few reasons that don't get discussed enough.
Data recency is a constant problem. People change email addresses, move houses, get new phones. A database that was accurate eighteen months ago has already degraded in ways that are nearly impossible to detect from the outside. Most commercial identity data has no reliable mechanism for flagging recency — there's no signal telling you whether that email address is still someone's primary account or a relic from a gym membership they cancelled in 2014.
Single-database lookups are a liability. The Canary File test exposed this directly. The approach that emerged from that testing was to query multiple databases simultaneously — five in the case of postal matching — and require at least three to return the same result before treating it as a confident match. Then layer on a mobile location verification step: matching the proposed address against GPS dwell-time data from the person's devices, confirming they actually live where the data says they live. That's what reliable matching actually requires. It's slower and more expensive than a single-database lookup. It's also meaningfully more accurate.
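The three-of-five rule described above can be sketched as a simple quorum vote, followed by the location check. In this Python sketch the function names, the crude lowercase normalization, and the 40-hour dwell threshold are assumptions for illustration; real postal matching needs proper address standardization.

```python
from collections import Counter


def consensus_address(lookups, min_agree=3):
    """Return the address at least `min_agree` sources agree on, else None."""
    answers = [a.strip().lower() for a in lookups if a]
    if not answers:
        return None
    address, votes = Counter(answers).most_common(1)[0]
    return address if votes >= min_agree else None


def verified_by_dwell_time(address, dwell_hours_by_address, min_hours=40):
    """Hypothetical location check: did the person's devices spend enough time there?"""
    return dwell_hours_by_address.get(address, 0) >= min_hours


# Results from five hypothetical databases; None means no record returned.
agreeing = ["12 Oak St, San Diego, CA", "12 oak st, san diego, ca",
            "12 Oak St, San Diego, CA", "98 Elm Ave, Austin, TX", None]
split = ["12 Oak St, San Diego, CA", "12 Oak St, San Diego, CA",
         "98 Elm Ave, Austin, TX", "98 Elm Ave, Austin, TX", None]
dwell = {"12 oak st, san diego, ca": 310}

candidate = consensus_address(agreeing)
print(candidate)                                 # 12 oak st, san diego, ca
print(verified_by_dwell_time(candidate, dwell))  # True
print(consensus_address(split))                  # None
```

With five sources and a quorum of three, at most one address can win, so a split result returns nothing rather than a guess.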
Cross-channel complexity multiplies everything. Someone browsing on desktop at work and returning on mobile that evening is the same person. Without identity resolution, they're two anonymous strangers. With it, they're a known prospect at a specific stage of a buying cycle — and every touchpoint can be calibrated accordingly.
What Solid Identity Resolution Unlocks
When the foundation is right, identity resolution changes the economics of everything built on top of it.
Audience quality that actually moves numbers. The difference between a campaign that cuts customer acquisition costs to a third and one that doesn't often comes down to seed audience quality, not ad creative. A look-alike model fed with accurately identified, high-intent visitors produces dramatically different results than one fed with broad demographic proxies. The platform's algorithm isn't smarter in one scenario; the input is. Better data in, better output out.
Stopping the waste of retargeting your own customers. It happens constantly: someone converts, becomes a customer, and keeps seeing acquisition ads for the next three weeks because the ad platform doesn't know they bought. Identity resolution — connected to a CRM — creates dynamic audience automations that remove converters from prospect pools the moment they become customers and route them into appropriate post-purchase sequences. That's recovered budget on every campaign, running automatically.
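At its core, that suppression step is set arithmetic on resolved profile IDs. The Python sketch below assumes hypothetical ID sets for prospects, customers, and a post-purchase audience; syncing the results to an ad platform is left out because it depends on that platform's own tooling.

```python
def refresh_audiences(prospects, customers, post_purchase):
    """Move anyone who has converted out of the prospect pool and into post-purchase."""
    converted = prospects & customers
    return prospects - converted, post_purchase | converted


prospects = {"p1", "p2", "p3", "p4"}
customers = {"p2", "p4", "p9"}   # resolved IDs from the CRM
post_purchase = {"p9"}

prospects, post_purchase = refresh_audiences(prospects, customers, post_purchase)
print(sorted(prospects))      # ['p1', 'p3']
print(sorted(post_purchase))  # ['p2', 'p4', 'p9']
```

The logic is trivial once the IDs are trustworthy. The hard part is the resolution that makes "p2" in the ad audience and "p2" in the CRM the same person.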
Audiences that don't expire. Traditional retargeting pixels have a shelf life — typically 30 to 90 days. Identity-resolved profiles don't expire the same way. They persist, update, and remain portable across platforms and campaigns. Over time, that becomes an owned asset: a growing identity graph of known customers and prospects that belongs to the business, rather than an audience rented from a platform on a per-campaign basis.
The Privacy Architecture
This is worth being direct about, because it gets conflated with surveillance in ways that aren't accurate or helpful.
Responsible identity resolution doesn't expose personally identifiable information to advertisers. The systems built to last operate on hashed identities throughout — the clear-text email address, the actual name, the postal address never touch the advertiser's platform. What the advertiser receives is an audience segment they can activate. The individual's identity stays protected within the technical layer.
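A minimal sketch of what "hashed identities" can look like in practice, assuming SHA-256 over a trimmed, lowercased email (a common convention, though the source does not name a specific algorithm). The function name is hypothetical.

```python
import hashlib


def hash_email(email):
    """Normalize, then SHA-256 hash an email so the clear-text value is never shared."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = hash_email("  Pat.Lee@Example.com ")
b = hash_email("pat.lee@example.com")

print(a == b)    # True: same person, same hash, regardless of formatting
print(len(a))    # 64
print("@" in a)  # False: the digest is hex characters only
```

Normalizing before hashing matters: without it, the same address typed two ways produces two unrelated hashes and the match is lost. Note that an unsalted hash is a stable pseudonym rather than anonymization, which is exactly why it can be matched across systems.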
This isn't just ethically right — it's structurally necessary for any identity infrastructure that intends to survive the next several years of regulatory evolution. GDPR and CCPA changed the rules permanently, and the direction of travel is toward more consumer protection, not less. Privacy as an architectural principle isn't a constraint to work around. It's what makes the system durable.
The Honest Takeaway
Identity resolution done well is one of the highest-leverage investments in a marketing stack. It makes audiences smarter, campaigns more efficient, spend less wasteful, and customer relationships more coherent across every touchpoint.
Identity resolution done poorly — single-database lookups, unverified match rates, probabilistic guessing sold as deterministic accuracy — produces expensive noise that's difficult to trace back to the source because everything looks fine from the top of the funnel.
The Canary File result was a hard lesson: take vendor claims seriously only after they've been tested against ground truth. The best way to evaluate any identity resolution solution is to bring your own verified data and ask to see what comes back.
If the answer is anything above 85%, ask a lot of follow-up questions.
If they won't run the test, that's your answer.
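For anyone who wants to run that test, the scoring itself is simple arithmetic. This Python sketch assumes a hypothetical canary file keyed by record ID, with made-up records, and reports two numbers separately: how many records the vendor returned at all, and how many came back correct on every checked field.

```python
def _norm(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return (value or "").strip().lower()


def score_vendor(canary, vendor_results, fields=("email", "address")):
    """Return (match_rate, accuracy) for a vendor's output against verified records."""
    if not canary:
        return 0.0, 0.0
    returned_count = 0
    correct_count = 0
    for record_id, truth in canary.items():
        returned = vendor_results.get(record_id)
        if returned is None:
            continue
        returned_count += 1
        if all(_norm(returned.get(f)) == _norm(truth[f]) for f in fields):
            correct_count += 1
    total = len(canary)
    return returned_count / total, correct_count / total


# Five hypothetical canary records with known-correct values.
canary = {
    1: {"email": "ana@example.com", "address": "12 Oak St"},
    2: {"email": "ben@example.com", "address": "98 Elm Ave"},
    3: {"email": "cy@example.com", "address": "7 Pine Rd"},
    4: {"email": "di@example.com", "address": "40 Bay Ct"},
    5: {"email": "eli@example.com", "address": "3 Hill Ln"},
}
# What a hypothetical vendor sent back: every record "matched", most of them wrong.
vendor = {
    1: {"email": "ana@example.com", "address": "12 Oak St"},
    2: {"email": "BEN@example.com", "address": "98 elm ave"},
    3: {"email": "cy@example.com", "address": "55 Old Address Way"},
    4: {"email": "di.old@example.com", "address": "40 Bay Ct"},
    5: {"email": "someone.else@example.com", "address": "3 Hill Ln"},
}

match_rate, accuracy = score_vendor(canary, vendor)
print(f"match rate: {match_rate:.0%}, accuracy: {accuracy:.0%}")
# match rate: 100%, accuracy: 40%
```

Keeping the two numbers apart is the point: a vendor can return something for every record and still be wrong on most of them.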
Gil Ortega
The Chief Rainmaker
When You Want Rain
Gil Ortega is the founder of Profit Worldwide, Inc. and the creator of the Chief Rainmaker brand — a San Diego-based marketing strategist focused on customer acquisition, audience engineering, and AI-powered growth systems. He is the author of Give Value Sell Results: Building Predictable Outcome Systems in the Age of AI.