
What exactly is a Data Product anyway? (Plus, let’s talk Shampoo!)

Data has evolved into an asset that can be packaged, sold, and leveraged independently: a Data Product.
Written by Alec Whitten
Published on 17 January 2022

Traditionally, businesses saw data as a byproduct of their operations. It was deemed valuable, but its primary function was to aid internal processes like analytics, operations, and decision-making. However, with the advent of the modern data stack, data has evolved into an asset that can be packaged, sold, and leveraged independently. This concept of treating data like a product has gained momentum in recent years, with leading voices like McKinsey outlining how this approach can allow companies to deliver new use cases 90% faster while reducing total cost of ownership by 30%.

The attributes of a “Product”

To grasp the concept of a Data Product, let's first dissect the attributes of another more familiar product - shampoo:

There's a lot we can learn about what a product is from a common shampoo bottle.


The packaging is the first thing the customer sees. Raw shampoo, without packaging, resembles an unidentifiable substance that you'd probably avoid. But when presented in an appealing bottle, complete with the “Dove” emblem, a product description, its active ingredients, and how-to-use directions, it suddenly represents quality and trustworthiness.


A shampoo bottle in a warehouse benefits no one. For both the consumer and the producer, the product's value is realized only when it reaches the user. It should be available wherever the consumer is shopping - be it a local store, a supermarket, or Amazon. Additionally, the consumer should have confidence that the chosen purchasing channel won't compromise the product's integrity.


In most consumer sectors, the adage "the customer is always right" prevails. This philosophy implies that every product should cater to its user. Whether you have curly, dry, or colored hair, the shampoo should resonate with you. It's not just about marketing; it's a holistic product development strategy. The target audience's needs guide its ingredients, manufacturing, packaging, and price.


Every good product, shampoo included, delivers value to both its user and its producer. For consumers, the shampoo could provide moisture, volume, or dandruff control. In return, they're ready to spend their money on it. Without delivering clear, tangible benefits that users can quantify and value, the product wouldn't have a market presence.

A good product provides context, is easily accessible, caters to a specific audience, and provides tangible value.

Enter Data Products

Data by itself is much like the raw shampoo described above: no one knows what to do with it, how to access it, what it means, or whether it’s worth anything. Hence the need for Data Products:

Data with context

Data Products need packaging: something that tells us what the data is, where it came from, and what it can be used for. We’ve seen a recent rise in enterprise Data Catalogs as a way to document metadata, but these catalogs are akin to a collection of cover pages that sit apart from the data they describe. True Data Products merge context, instructions, and quality assurance with the data itself, enabling even those without technical expertise to use them.
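As a rough illustration of what "packaging" could look like in code, here is a minimal Python sketch. The `DataProduct` class, its field names, and the sample order rows are all hypothetical, invented for this example rather than taken from any particular platform. The idea is simply that the description, provenance, usage notes, and quality checks travel with the rows instead of living in a separate catalog.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class DataProduct:
    """Hypothetical wrapper that ships data together with its 'packaging'."""

    name: str
    description: str   # what the data is
    source: str        # where it came from
    usage_notes: str   # ideas for what it can be used for
    rows: List[Row]
    # Named quality checks; each takes a row and returns True if the row passes.
    checks: Dict[str, Callable[[Row], bool]] = field(default_factory=dict)

    def quality_report(self) -> Dict[str, int]:
        """Count the rows that fail each named check."""
        return {
            check_name: sum(1 for row in self.rows if not check(row))
            for check_name, check in self.checks.items()
        }

    def label(self) -> str:
        """Render the human-readable 'label on the bottle'."""
        return (
            f"{self.name}: {self.description}\n"
            f"Source: {self.source}\n"
            f"How to use: {self.usage_notes}"
        )


# Illustrative data only; the table and column names are assumptions.
orders_product = DataProduct(
    name="monthly_orders",
    description="One row per customer order, refreshed monthly.",
    source="orders table in a (hypothetical) sales database",
    usage_notes="Join on customer_id to segment revenue by customer.",
    rows=[
        {"order_id": 1, "customer_id": "c1", "amount": 40.0},
        {"order_id": 2, "customer_id": None, "amount": 15.5},
    ],
    checks={
        "customer_id_present": lambda row: row["customer_id"] is not None,
        "amount_non_negative": lambda row: row["amount"] >= 0,
    },
)

print(orders_product.label())
print(orders_product.quality_report())
```

A consumer who has never seen the source system can read the label and the quality report before touching a single row, which is the point of the packaging analogy.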

Decoupling data from storage technology

Data Products shield users from the intricacies of the underlying technology. Consumers shouldn't have to concern themselves with the underlying database type, file structure, or data format. This is a tough engineering challenge: Forrester’s report shows that data teams spend 70% of their time just prepping data for analysis. Still, that is the goal of a Data Product: making data accessible wherever the consumer wants it, whether that is Snowflake, an API, or even Google Sheets.

A data product should be available wherever the consumer wants, regardless of the underlying source.
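One way to picture this decoupling is a small interface that hides the storage format from the consumer. The sketch below is an assumption-laden illustration, not a reference design: `Source`, `CsvSource`, `InMemorySource`, and `total_amount` are hypothetical names, and a real product would sit in front of systems like a warehouse or an API rather than a CSV string.

```python
import csv
import io
from typing import Dict, Iterable, List, Protocol

Row = Dict[str, str]


class Source(Protocol):
    """Anything that can hand back rows, whatever it stores them in."""

    def read(self) -> Iterable[Row]: ...


class CsvSource:
    """Rows held as CSV text (a stand-in for a file on disk)."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read(self) -> Iterable[Row]:
        return csv.DictReader(io.StringIO(self._text))


class InMemorySource:
    """Rows held as Python dicts (a stand-in for a database result set)."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def read(self) -> Iterable[Row]:
        return iter(self._rows)


def total_amount(source: Source) -> float:
    """Consumer code: depends only on read(), never on the storage format."""
    return sum(float(row["amount"]) for row in source.read())


csv_source = CsvSource("order_id,amount\n1,40.0\n2,15.5\n")
memory_source = InMemorySource(
    [{"order_id": "1", "amount": "40.0"}, {"order_id": "2", "amount": "15.5"}]
)

# The same consumer function works against either backend.
print(total_amount(csv_source), total_amount(memory_source))
```

Swapping the backend changes nothing for the consumer, which is the property the section describes.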

Data with a purpose

A Data Product is not data sitting in a data lake or database; it caters to a specific audience or a specific use case. This means work needs to go into refining a data set into a data product: tables need to be joined, data aggregated, PII redacted, and so on. This is probably why Harvard Business Review recently published an article on why companies need Data-Product Managers, and why teams of data engineers, analytics engineers, analysts, and data scientists are deployed against this task.
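The refinement steps mentioned above (join, aggregate, redact PII) can be sketched in a few lines of plain Python. Everything here is illustrative: the `customer_rows` and `order_rows` tables, the `PII_FIELDS` set, and the `build_revenue_by_segment` function are made up for the example, and a production pipeline would more likely run in SQL or a dataframe library.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical raw tables.
customer_rows: List[Row] = [
    {"customer_id": "c1", "email": "ana@example.com", "segment": "retail"},
    {"customer_id": "c2", "email": "bo@example.com", "segment": "wholesale"},
]
order_rows: List[Row] = [
    {"order_id": 1, "customer_id": "c1", "amount": 40.0},
    {"order_id": 2, "customer_id": "c1", "amount": 10.0},
    {"order_id": 3, "customer_id": "c2", "amount": 15.5},
]

# Columns treated as PII in this example (an assumption).
PII_FIELDS = {"email"}


def build_revenue_by_segment(
    customers: List[Row], orders: List[Row]
) -> Dict[str, float]:
    """Refine two raw tables into a 'revenue by segment' data set."""
    # Redact: drop PII columns before the data goes any further.
    safe_customers = [
        {key: value for key, value in customer.items() if key not in PII_FIELDS}
        for customer in customers
    ]
    # Join: look up each order's customer segment by customer_id.
    segment_by_id = {c["customer_id"]: c["segment"] for c in safe_customers}
    # Aggregate: sum order amounts per segment.
    totals: Dict[str, float] = defaultdict(float)
    for order in orders:
        segment = segment_by_id.get(order["customer_id"])
        if segment is None:
            continue  # skip orders with no matching customer
        totals[segment] += order["amount"]
    return dict(totals)


revenue_by_segment = build_revenue_by_segment(customer_rows, order_rows)
print(revenue_by_segment)
```

The output is shaped for one audience (someone asking "which segment drives revenue?") and contains no email addresses, in contrast to the raw tables it was built from.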

Data that is valuable in and of itself

There are multiple ways to extract value from your data, the baseline case being internal analytics. Data Products push us to consider other ways data can deliver value, like intelligent applications that leverage data to make recommendations (e.g. Netflix) or to improve our daily lives (e.g. Google Maps). Increasingly, we’re also seeing a rise in companies packaging, sharing, and monetizing the data itself (Data as THE product), with demand for external data sets expected to grow 3x in the next three years.

Data Products are a new concept but a powerful one, underpinning new paradigms like Data Mesh, Semantic Layers, and Data Marketplaces. Organizations that invest in platforms that help them treat their data as a product are better positioned to innovate, optimize operations, and gain a competitive edge in the new data economy.
