4 Levels of Data Proficiency

Data is the lifeblood of product companies, but with access to lots of information come big challenges. Our brains were not designed to process large amounts of facts and figures, and we can easily be overwhelmed and distracted. Cognitive biases may kick in to create simplified and often wrong narratives based on a few salient data points. We’re also tempted to cherry-pick data that support our opinions and disregard everything else. The results are often “data-driven” strategies, roadmaps, and ideas that have very little to do with the reality on the ground.

A classic example: In the late 2000s, slumping PC sales and the rapid adoption of multi-touch devices convinced Microsoft that personal computers were about to be displaced by tablets. Subsequently, in Windows 8, the company replaced the classic start menu with a start screen designed for touch devices. This was a very unpopular change, which Microsoft had to gradually roll back over a number of years (Windows 11 comes with a classic desktop and start menu). In hindsight, the data didn’t foretell the future. Touch didn’t become a major part of the PC interface; PC sales are now stronger than ever.

Strong product companies make a much more systematic and deliberate use of data. They create “a culture of truth-seeking” to quote Jeff Bezos, which gives them a massive advantage over less data-savvy competitors. 

In this article I’ll suggest three levels of data proficiency to aim for in your own org: 1) Business Modeling, 2) Data-Driven, and 3) Evidence-Guided. I’ll also discuss a speculative fourth level: AI-Powered.

Level 1: Business Modeling 

While everyone intuitively understands the importance of metrics, many companies struggle to use them effectively. One common issue is myopic focus on dozens of business metrics (revenue, ARR, ARPU, profit, costs, paying customers…), while measuring very little (if anything) to do with user experience, value-to-customer, product health, or company vitality. Another common problem is chasing dozens of metrics (sometimes suggested on the Internet) with no clear order of importance, sense of cause-and-effect, or relevance to the company.  

Random selection of metrics frameworks found on the Internet

Strong product companies don’t start with metrics; they start with models. Specifically, they model customer behavior and business growth and how the two are interconnected. The models are customized to the business realities and strategic choices of the company, and they surface the most important metrics that the company should set goals for and track.

Here are three types of models that are broadly applicable.

1. Flywheels

The Amazon Flywheel

Flywheel models explain company growth through virtuous cycles of cause-and-effect. The Amazon flywheel shown above is perhaps the most famous example. The core loop shows how greater selection of products improves customer experience, which leads to more traffic in the Amazon stores, which in turn attracts more sellers, further increasing selection. With more traffic, products, and users, Amazon benefits from economies-of-scale advantages that allow it to improve its cost structure. The savings are funneled to the users in the form of lower prices, which in turn spins the core loop even faster. The “nodes” in the flywheel represent the most important factors in Amazon’s success and thus point at the company’s top metrics. 

2. The Value Exchange Loop 

The Value Exchange Loop model simply states that delivering value to the market (users, customers, partners) and capturing value back are the two parts of every company’s mission. The company therefore should start with two top-level metrics, the Top Business KPI and the North Star Metric, that measure captured value and delivered value respectively. Here are some examples:

As simple as this model is, it is also very powerful. It gives the company two clear top-level metrics to improve rather than dozens, it balances revenue-focus with customer-focus, and it allows for objective measurement of impact of any product idea or business initiative.

3. Metrics Trees

The top-level metrics can further be broken down into metrics trees to create a more nuanced model. The metrics at the root of the trees are the most important, but the metrics at the leaves are the most actionable and can be assigned to individual teams or groups to work on.  
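
To make the root-versus-leaf distinction concrete, here is a minimal sketch of a metrics tree in Python. The product, the metric names, and the owning teams are all hypothetical, invented only for illustration; a real tree would come out of your own business model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metric:
    """One node in a metrics tree."""
    name: str
    owner: str = ""  # team accountable for the metric (hypothetical)
    children: list[Metric] = field(default_factory=list)

    def leaves(self) -> list[Metric]:
        """Return the leaf metrics under this node, left to right."""
        if not self.children:
            return [self]
        found: list[Metric] = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# Hypothetical tree for an imaginary messaging product.
north_star = Metric("Messages sent per week", children=[
    Metric("Weekly active users", children=[
        Metric("New user activation rate", owner="Onboarding team"),
        Metric("Week-4 retention rate", owner="Engagement team"),
    ]),
    Metric("Messages per active user", owner="Messaging team"),
])

# The leaves are the actionable metrics that teams can own.
for leaf in north_star.leaves():
    print(f"{leaf.name} -> {leaf.owner}")
```

The root stays the metric everyone ultimately cares about, while each leaf carries an owner, which mirrors the idea that leaf metrics are the ones to assign to teams.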

To learn more about the value exchange loop and metrics trees see this article

Level 2: Data-Driven

Many companies pay lip service to being “data-driven”, yet their decisions are very much guided by good-old opinions and consensus. Truly data-driven companies are ones where data — both quantitative and qualitative — is central to the culture and to the operation of the org (Jeff Bezos again: “We go where the data leads us”). Counter to popular belief, in data-driven companies data is meant to complement human judgment, not to replace it. 

Here are some important traits of data-driven companies. 

1. Investment In Data Collection and Processing

In many of the product companies I meet, people are held back by a common challenge: “we don’t measure that.” Without true measurement and data collection, the product org, and the company as a whole, are flying in the dark and thus have to resort to opinions and intuition.

Data-driven companies continuously invest in data collection, processing, and analysis. This includes ingesting data from multiple sources, building data pipelines, cleaning and processing the data, and constructing data warehouses or data lakes. For a quick overview I recommend watching this short video by ByteByteGo:

Data-driven companies also invest in qualitative data, which includes hiring user researchers and regularly conducting user interviews and field research, as well as closely observing market trends and the competitive landscape. Raw data is stored and processed and the findings are shared broadly.

Before you dismiss this as another not-possible-in-my-company thing, consider that even in data-driven companies many data projects are started bottom-up at the initiative of product teams. Don’t wait for permission and funding. Today there are many affordable off-the-shelf tools and open source solutions to get you started. If you’re just starting, I’d recommend against launching a massive data project that is likely to stretch from quarters to years. Instead, start with measuring the things you really need to know now, and extend your data infrastructure and analysis over time.
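
As an illustration of how small a first step can be, here is a rough sketch that computes a single metric, weekly active users, from a plain list of logged events using only the Python standard library. The event format, the user IDs, and the choice of metric are assumptions made for the example, not a recommendation for your product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw events: (user_id, day the user performed a key action).
events = [
    ("u1", date(2024, 1, 1)),
    ("u2", date(2024, 1, 3)),
    ("u1", date(2024, 1, 4)),   # same user, same week: counted once
    ("u1", date(2024, 1, 9)),
    ("u3", date(2024, 1, 10)),
]


def weekly_active_users(events):
    """Count distinct users per ISO (year, week)."""
    users_by_week = defaultdict(set)
    for user_id, day in events:
        iso = day.isocalendar()  # (ISO year, ISO week number, weekday)
        users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


print(weekly_active_users(events))
```

Something this simple can run on an export from a database or a log file long before a data warehouse exists, and it can be replaced later as the infrastructure grows.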

2. Data Analysis Rituals  

Even companies that have access to the data may not make good use of it. Infrequent and inconsistent data reviews are common, as is an over-focus on a few specific metrics that misses the bigger picture. Data-driven companies build processes to ensure that leaders at all levels are aware of the bigger data picture and make good, data-informed decisions. In this article Sachin Rekhi describes some of these practices based on his experiences at LinkedIn and SurveyMonkey. One of these is the construction of dashboards. The most important metrics from the models we discussed earlier, plus important related metrics, are placed on an executive-level dashboard.

Dashboard implemented with MixPanel

Another important ritual is the weekly metrics review meeting, which is often run by a senior executive, ideally the CEO, around the dashboard. The meeting follows a semi-fixed agenda:

“The first part of the meeting should be about discussing the metrics in the dashboard. You might go around by product or by funnel stage (growth, engagement, retention) or you might go through recent A/B test results. Every metric on the dashboard should have a clear owner and they should speak to the changes that have been seen in the past week and their headline commentary about it. They might also mention upcoming initiatives or changes they expect on the business performance. Everyone in the room then has an opportunity to ask questions about what they are seeing in the data or share their perspective. This discussion is critical because it allows teams to share with others what they are seeing and allow the team to collectively help explain what might be causing it, suggest further follow-up investigations that might be helpful, and so on. It’s from these discussions that the real learning happens. For any question that the team can’t answer, it should be decided in the meeting whether there should be a follow-up action item taken to investigate. Those should be documented and then discussed in the subsequent meeting.” – Sachin Rekhi / A Leader’s Guide to Metrics Reviews. 
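
One way to prepare for such a review is to compute the week-over-week change for each dashboard metric and list the ones that moved enough to deserve discussion. The sketch below is only an illustration: the metrics, owners, numbers, and the 5% threshold are all made up, and it assumes last week's value is never zero.

```python
from dataclasses import dataclass


@dataclass
class MetricSnapshot:
    name: str
    owner: str          # person or team who speaks to this metric in the review
    last_week: float    # assumed to be non-zero
    this_week: float

    @property
    def change_pct(self) -> float:
        """Week-over-week change as a percentage of last week's value."""
        return (self.this_week - self.last_week) / self.last_week * 100


def needs_discussion(snapshots, threshold_pct=5.0):
    """Return metrics whose absolute weekly change meets the threshold."""
    return [s for s in snapshots if abs(s.change_pct) >= threshold_pct]


# Hypothetical dashboard contents.
dashboard = [
    MetricSnapshot("Weekly active users", "Growth team", 12_000, 12_300),
    MetricSnapshot("Week-4 retention (%)", "Engagement team", 40.0, 36.0),
    MetricSnapshot("Paying customers", "Monetization team", 800, 850),
]

for s in needs_discussion(dashboard):
    print(f"{s.name} ({s.owner}): {s.change_pct:+.1f}% week over week")
```

A fixed threshold is a crude filter; it is meant to start the conversation in the meeting, not to replace the owners' commentary.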

3. Self-Serve Data Access

In data-driven companies, data access is not the prerogative of executives, nor do you need to go through the Business Intelligence team. The data (with some level of authorization) is broadly available to query and to consume. Teams may use self-serve analytics tools, queries or APIs. Teams often build their own dashboards and meet around them regularly. 

Here again it’s tempting to excuse yourself if your company doesn’t yet offer an analytics tool that puts every metric at your fingertips, but I have to tell you that even at data-driven companies, while the data is available, consumption is not always easy. For long stretches during my time at Gmail I relied on database queries (often run for me by generous engineers), an archaic home-grown analytics tool seemingly created by engineers for engineers, and spreadsheet analysis and visualization (I later learned to use the R programming language). Working this way is not optimal, but if you wait for the perfect analytics system to be put in place you may end up waiting a very long time.

Level 3: Evidence-Guided

Data and evidence are not one and the same. Data may not always be meaningful; it may not tell any clear story. Evidence is data that confirms or refutes our assumptions with respect to users, market, and business. Our experience with science, medicine, and law shows how important evidence is in supercharging human judgement and driving better decisions. Companies that are evidence-guided test the assumptions in their product and business ideas and act on the evidence. They park or pivot ideas that don’t work, and double down on the ideas that do, a practice now known as product discovery.

This is a topic I’ve written extensively about, including in my book Evidence-Guided, so I will not go very deep into it here, except to say that to be evidence-guided you need to get good both at testing assumptions (validation) and at analyzing impact and costs in light of the evidence (evaluation). These are data-centric activities that are built on top of the two proficiency levels I covered above: business modeling and data-driven. But here too, you should start with what you have and not wait for a great testing platform to arrive. There are many validation techniques, and many of them (for example user interviews, surveys, and smoke tests) don’t require sophisticated infrastructure.

The AFTER model for idea validation (see my handbook for details)

Level 4: AI-Powered 

With all the recent advances made by artificial intelligence, and especially generative ML with all its human-like emergent behaviors, we have to ask how AI will affect the way we work with data. This part is mostly speculative, as I believe no one knows for sure. Still, there are many promising use-cases:

Business Modeling

  • Help develop a growth model
  • Answer questions such as “which sub-metrics may affect this metric”
  • Help in other forms of business modeling, for example market sizing, idea impact estimations

Data Infrastructure 

  • Detect and clean bad data
  • Process and format data

Data Analysis

  • Generate visualizations and reports
  • Answer data questions
  • Summarize textual data
  • Detect patterns 
  • Suggest insights

Product Discovery

  • Generate fake data for testing
  • Design and code prototypes
  • Analyze experiment results
  • Find past supporting evidence
  • Help estimate idea impact, cost, confidence

Work is already in progress. For example, some analytics systems will already let you create charts or query data using AI prompting. Some even detect unusual signals, make predictions, and suggest action, all powered by AI.  

We should proceed with caution. Machine learning models are not designed to be perfect stores of data or knowledge. This is particularly evident with Large Language Models (LLMs), which will sometimes confidently return false information, not unlike how our own minds sometimes embellish or even fabricate memories. So any data solution will likely have to combine classic data infrastructure with a layer of AI.

Common forms of machine-learning

Also, the quality of a machine learning model heavily depends on its training data. In cases where there aren’t many good examples, for example business modeling or metrics-driven product decisions, the model may generate bad advice. 

Still, I’m hopeful that with enough time, AI will help reduce the friction and let us bridge the gaps I mentioned in this article. More organizations will model their businesses better, become more data-driven, and practice real evidence-guided development. 


Even longer term, we can envision a future in which our AI systems will ingest all product, user, customer, and market data, and help us make better decisions on strategy, business, and product development. Among the many voices of people with opinions, there’s definitely room for the voice of the bot-with-data. 
