Where Customer Data Comes From: Zero, First, Second, Third Party
Knowing what data is inside a CDP is only half the picture. The other half is customer data ownership: who collected it, and what rights you have.
Zero, first, second, and third party are not data types. They are ownership labels. The difference decides what you can legally do, what you can ethically do, and how far personalization can go before it feels creepy.
Two questions, not one
Types tell you what the data is. Profile, behavioral, transactional, engagement, consent, preference, derived. That framing lives inside your CDP.
Parties tell you where the data came from. Who collected it. Who owns it. Who has to consent to its use.
Both questions apply to every record. An email address is profile data by type. It is first-party data by source if a customer typed it into your signup form, second-party if a partner shared it under agreement, third-party if you bought it from a broker. The record looks the same. The rights attached to it are not.
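A minimal sketch of what "both questions apply to every record" can look like in practice. The names DataType, Source, and CustomerRecord are hypothetical, chosen for illustration, and do not reflect any particular CDP's schema.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    """What the data is (the seven types)."""
    PROFILE = "profile"
    BEHAVIORAL = "behavioral"
    TRANSACTIONAL = "transactional"
    ENGAGEMENT = "engagement"
    CONSENT = "consent"
    PREFERENCE = "preference"
    DERIVED = "derived"


class Source(Enum):
    """Where the data came from (the four parties)."""
    ZERO_PARTY = "zero"
    FIRST_PARTY = "first"
    SECOND_PARTY = "second"
    THIRD_PARTY = "third"


@dataclass(frozen=True)
class CustomerRecord:
    # Hypothetical record shape: every value carries both labels.
    name: str
    value: str
    data_type: DataType
    source: Source


# The same email address arriving through three different routes.
signup = CustomerRecord("email", "ana@example.com", DataType.PROFILE, Source.FIRST_PARTY)
partner = CustomerRecord("email", "ana@example.com", DataType.PROFILE, Source.SECOND_PARTY)
broker = CustomerRecord("email", "ana@example.com", DataType.PROFILE, Source.THIRD_PARTY)

# Same value and same type, but three different sources.
same_value = signup.value == partner.value == broker.value
distinct_sources = {signup.source, partner.source, broker.source}
```

Keeping the source as its own field, rather than folding it into the type, is what lets the three rows stay distinguishable even though their values are identical.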
Most teams treat these two dimensions as one flat taxonomy. It does not work. You need both.
Zero-party data
What the customer chose to tell you. Explicitly.
Quiz answers, preference center selections, stated interests, self-declared demographics, survey responses, saved filters. Anything the customer knew they were giving you and gave on purpose.
Zero-party data is the highest-quality input you can have for personalization. It is explicit, it carries intent, and consent is built into the act of giving it. Nobody fills out a preference quiz by accident.
The constraint is volume. Customers only give you zero-party data when you ask, and only when the ask is worth their time. That means every zero-party collection point has to earn its place. A three-question quiz that shapes the first email they get, yes. A twenty-field preference center nobody updates, no.
When zero-party data works, it works because the customer sees the loop close. They told you they prefer email. The next day they got an email, not a push. Without that loop, the collection feels like a tax.
First-party data
What you observed from direct interaction with the customer.
Your website, your app, your store, your call center, your CRM, your loyalty system. Anything the customer did on a surface you own, under a relationship you have.
This is the default layer in any modern customer data platform. For most teams, first-party data is where the seven data types live. Behavioral events come from your site. Transactions come from your checkout. Engagement comes from your campaigns. If you run a single-brand business, first-party is probably around 80% of what sits in your CDP.
Ownership is yours, as long as you collected it under valid consent. That condition is the constraint. First-party does not mean consent-free. It means the customer interacted with you, not with someone else. Consent rules still apply.
First-party data is the only source that scales with your business automatically. More customers, more sessions, more transactions, more data. No partnerships to renegotiate, no brokers to pay. This is why the entire industry has been quietly shifting weight onto it for the last five years.
The limit of first-party data is reach. It only covers customers who already interact with you. It tells you nothing about the people who have never visited your site or opened your app. For everything beyond your own audience, you need another party.
Second-party data
Someone else's first-party data, shared with you by explicit agreement.
Say you run a bank and you partner with a telco. Their loyalty program meets your credit card. Customers who opted into both see offers co-designed by the two companies. Each side shares profile attributes, transactional signals, or engagement outcomes that the other side could not collect alone.
Second-party is rare, and it is powerful where it exists. Rare because it requires trust, contract, and governance on both sides. Powerful because the data quality is first-party grade. The partner collected it under their own consent and terms. It was not scraped from the open web.
The legal structure matters more than the technology. A second-party data relationship without a documented data sharing agreement is a compliance risk. A second-party relationship with a clean agreement is one of the strongest inputs a CDP can take in.
Two things make second-party fragile. First, consent is narrow. The customer consented to sharing with one named partner, not with the world. The moment the data leaves the agreed channels, it becomes something else. Second, the partnership ends. When it does, the data either gets removed or continues to live in your systems under the wrong assumptions. Write the offboarding clause before you write the onboarding one.
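The offboarding problem can be sketched as a check that runs against every second-party record. This is a minimal illustration in Python, assuming each record points at a documented agreement with an end date. SharingAgreement, SecondPartyRecord, and records_to_offboard are hypothetical names, and the partners and dates are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class SharingAgreement:
    # Hypothetical: one documented agreement per partner.
    partner: str
    ends_on: Optional[date]  # last day the agreement is in force; None if open-ended


@dataclass(frozen=True)
class SecondPartyRecord:
    customer_id: str
    attribute: str
    agreement: SharingAgreement


def records_to_offboard(records: List[SecondPartyRecord], today: date) -> List[SecondPartyRecord]:
    """Return records whose sharing agreement is no longer in force on `today`."""
    return [
        r for r in records
        if r.agreement.ends_on is not None and r.agreement.ends_on < today
    ]


telco = SharingAgreement("telco-loyalty", ends_on=date(2025, 6, 30))
airline = SharingAgreement("airline-miles", ends_on=None)

records = [
    SecondPartyRecord("c-1", "loyalty_tier", telco),
    SecondPartyRecord("c-2", "miles_balance", airline),
]

# The day after the telco agreement ends, its records are flagged for removal.
expired = records_to_offboard(records, today=date(2025, 7, 1))
```

Attaching the agreement to the record itself means the end of a partnership shows up as a query result, not as a surprise during an audit.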
Third-party data
Data bought from brokers. Aggregated from sources you do not own, often from customers who never knew you specifically.
Interest segments, demographic models, inferred purchase intent, behavioral lookalikes built from cookie graphs. Historically the fuel of display advertising, programmatic bidding, and cold prospecting.
Third-party data is getting harder to use every year. Safari blocks third-party cookies by default. Firefox isolates them per site, which breaks cross-site tracking even when the cookie itself survives. Chrome reversed its deprecation plan in 2024 and kept cookies in place, but Apple's tracking prompts, GDPR enforcement, state-level US privacy laws, and regulators sharpening their focus on consent all point the same direction. The cost of third-party data goes up. The accuracy goes down. The legal exposure rises every year.
That does not mean third-party data has no value. It still has a role in upper-funnel reach, contextual targeting, and reference demographics. But treating it as a core personalization input is a bet against the direction everything is moving. A cookieless marketing strategy is no longer the speculative plan. It is the default assumption for 2027.
If your CDP or activation stack depends heavily on third-party signal, the work now is to replace that dependency with zero-party and first-party equivalents before the cost-benefit flips for good. For most companies, that work is at least a year long. Start it on a calm month, not a crisis month.
The matrix no one draws
Type and source are the two axes. Together they form a matrix that is more useful than any single label.
One example. A churn probability score is derived data by type. It is first-party by source if you built it from your own behavioral and transactional history. If your partner calculated a comparable score on their side of a joint program and you ingested it, the same derived score is second-party instead. Same metric, different rights, different retention rules, different usage limits.
Another. An email address sitting in your newsletter list is profile by type. It is zero-party if the customer typed it in during a quiz. It is first-party if it came out of your e-commerce checkout. It is second-party if a partner passed it under a documented sharing agreement. It is third-party if you bought it from a broker. The four cells describe four different legal and ethical situations for what looks like the same row in a database.
Most teams force everything into one flat label, usually "first-party," and quietly hope nothing in there is actually second-party. That works until a regulator asks, a customer files a data subject request, or a partnership ends.
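One way to draw the matrix is to count stored attributes per (type, source) cell. The sketch below uses Python's collections.Counter. The rows list is a made-up export used only for illustration, and the cell helper is a hypothetical name.

```python
from collections import Counter

# Hypothetical export: one (type, source) pair per stored attribute.
rows = [
    ("profile", "zero"),
    ("profile", "first"),
    ("profile", "second"),
    ("behavioral", "first"),
    ("behavioral", "first"),
    ("transactional", "first"),
    ("derived", "first"),
    ("derived", "second"),
]

# Each key is one cell of the type x source matrix; each value is its count.
matrix = Counter(rows)


def cell(data_type: str, source: str) -> int:
    """Number of attributes in one cell; empty cells count as zero."""
    return matrix[(data_type, source)]


# Attributes that a flat "first-party" label would misdescribe.
not_first_party = sum(n for (_, source), n in matrix.items() if source != "first")
```

Empty cells matter as much as full ones: a zero in the derived and third-party cell is a claim you can check against your ingestion logs.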
What the shift means for personalization
Personalization quality is a function of the share of zero- and first-party data in your inputs. The higher that share, the more accurate the targeting, the cleaner the consent trail, and the lower the privacy risk.
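As a rough illustration, the share of zero- and first-party inputs can be computed from per-source counts. The function name owned_signal_share and the numbers are assumptions made for this sketch. The result is a tracking metric, not a measure of personalization quality itself.

```python
from typing import Dict


def owned_signal_share(counts: Dict[str, int]) -> float:
    """Share of inputs that are zero- or first-party, between 0.0 and 1.0."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no inputs at all: report zero instead of dividing by zero
    owned = counts.get("zero", 0) + counts.get("first", 0)
    return owned / total


# Hypothetical counts of personalization inputs by source.
inputs = {"zero": 10, "first": 50, "second": 15, "third": 25}
share = owned_signal_share(inputs)
```

Tracked over several quarters, a number like this shows whether the replacement work is actually moving weight away from rented signal.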
Teams that built their stack during the third-party cookie era are finding that their personalization is degrading without any change on their end. An audience that used to match at 70% may now match at 40%. Lookalike models trained on cookie-era behavior drift. The data is not getting worse. The world is getting tighter around how that data can be collected and used.
The companies coming out ahead are the ones who treat first-party data strategy and zero-party data strategy as strategic investments, not compliance afterthoughts. A well-designed preference center collects zero-party at scale. A clean event taxonomy collects first-party without drift. Both compound over time. Neither requires buying your way out of the cookie apocalypse. This is what privacy-first marketing looks like when you get past the slogan: more signal you collected yourself, less signal you rented.
What to do with this
Map your customer data across both dimensions. Not just by type. By source.
Three questions to ask of the map:
How much of our usable profile data is zero-party versus first-party? If it is all first-party inference, you are leaving accuracy on the table.
Are there any records in our CDP that should be labeled second-party but are not? Revisit any ingestion from a partner feed. Check the agreement.
How much of our activation still depends on third-party audiences? Is there a plan to replace each one?
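The second and third questions lend themselves to a simple scripted check. This sketch assumes an inventory where every attribute records the feed it was ingested from, and where partner feeds share a "partner:" prefix. That prefix, the field names, and the sample rows are all hypothetical conventions for illustration.

```python
# Hypothetical inventory: where each attribute was ingested from and how it is labeled.
attributes = [
    {"name": "email", "feed": "signup_form", "label": "first"},
    {"name": "loyalty_tier", "feed": "partner:telco", "label": "first"},
    {"name": "miles_balance", "feed": "partner:airline", "label": "second"},
]

# Hypothetical activation audiences, each with an optional replacement plan.
audiences = [
    {"name": "lapsed_buyers", "source": "first", "replacement": None},
    {"name": "auto_intenders", "source": "third", "replacement": "quiz_segment"},
    {"name": "lookalike_q3", "source": "third", "replacement": None},
]


def mislabeled_partner_attributes(attrs):
    """Attributes that arrived on a partner feed but are not labeled second-party."""
    return [
        a["name"] for a in attrs
        if a["feed"].startswith("partner:") and a["label"] != "second"
    ]


def third_party_without_plan(auds):
    """Third-party audiences that have no replacement planned."""
    return [
        a["name"] for a in auds
        if a["source"] == "third" and a["replacement"] is None
    ]


to_relabel = mislabeled_partner_attributes(attributes)
to_plan = third_party_without_plan(audiences)
```

Anything in to_relabel is a prompt to go back and read the sharing agreement. Anything in to_plan is a dependency without an exit.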
Most teams do this once, find gaps, and never look at it again. Treat it as an annual audit. Consent regimes change, partnerships end, and platforms deprecate cookies on their own schedule, so last year's map goes stale.
For the other axis, the seven types, see Seven Types of Customer Data Inside a CDP. For how a CDP sits against a CRM, see CDP vs CRM in 2026. If you are still building the internal case for better customer data infrastructure, What Is a CDP for Managers is the shortest path to a shared vocabulary with your leadership.
Seven types tell you what the data is. Four parties tell you who owns it. A CDP that respects both is the one you can still defend in two years.



