The reinventing-the-wheel problem
Something funny is happening at companies where AI has made everyone a data analyst overnight. They're all 10x faster, but they're all moving 10x faster in slightly different directions. The sales team builds a pipeline report with an AI tool. Marketing builds their own version. Finance has a third. All three give different numbers. All three feel "right" because an AI generated them.
We're not just reinventing the wheel. We're reinventing it 10x faster, in 10 different shapes, and nobody realizes they're all solving the same problem differently until the board meeting where three people present three different revenue numbers.
Nobody talks about this because it's boring. Orchestration doesn't get a TechCrunch headline. But it's the thing that actually breaks companies.
When a data team used to be the bottleneck, at least there was a single source of truth. The analyst was slow, sure - but when they produced a number, it was THE number. Now that AI has removed that bottleneck, what we've gained in speed we've lost in consistency.
The "it worked, so it must be right" trap
AI is incredibly good at sounding right. It never hedges. It never says "well, it depends on how you define active user" the way a data analyst would. (Data analysts love saying that. We can't help it. Because it always depends.)
Is that "revenue" number ARR or MRR? Does it include expansion? Are those "active users" daily or monthly? Does "pipeline" mean created this quarter or currently open? These aren't pedantic distinctions - they're the difference between a company that's growing and a company that thinks it's growing.
When a sales rep asks an AI tool "what's our pipeline?" and gets $4.2M, they take it at face value. They don't ask whether that includes partner-sourced deals, whether it's weighted or unweighted, or whether deals past due have been excluded. The AI gave them a number, it looks reasonable, and they move on.
Meanwhile, the actual pipeline - by the definition the board uses - is $2.8M. That's a fun one to discover in the board meeting. Really livens up the room.
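To see how easily those two numbers diverge, here's a toy calculation with entirely made-up deals - the amounts are chosen just to mirror the $4.2M vs. $2.8M gap above, and the field names are illustrative, not a real CRM schema:

```python
from datetime import date

# Hypothetical deal data. Amounts are picked to mirror the $4.2M vs $2.8M
# example; none of this reflects a real schema.
deals = [
    {"name": "Deal A", "amount": 1_200_000, "partner_sourced": False, "close_date": date(2025, 9, 30)},
    {"name": "Deal B", "amount": 1_600_000, "partner_sourced": False, "close_date": date(2025, 8, 15)},
    {"name": "Deal C", "amount":   800_000, "partner_sourced": True,  "close_date": date(2025, 9, 1)},
    {"name": "Deal D", "amount":   600_000, "partner_sourced": False, "close_date": date(2025, 1, 10)},  # past due
]

today = date(2025, 6, 1)

# "Pipeline" the way an AI tool might naively compute it: sum everything.
naive_pipeline = sum(d["amount"] for d in deals)

# "Pipeline" the way the board defines it: in-house deals, not past due.
board_pipeline = sum(
    d["amount"]
    for d in deals
    if not d["partner_sourced"] and d["close_date"] >= today
)

print(f"AI's number:    ${naive_pipeline:,}")   # $4,200,000
print(f"Board's number: ${board_pipeline:,}")   # $2,800,000
```

Same data warehouse, same deals, two defensible answers. Neither query is "wrong" - they're just answering different questions, and nobody wrote down which question is the official one.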
Why documentation and semantic layers aren't enough
Whenever I bring this up, someone says "just use a semantic layer" like it's a magic spell. And look, semantic layers help. dbt's semantic layer, Looker's LookML, well-maintained data dictionaries - all good things.
But documentation has a fatal flaw: it can only cover the problems you've already thought of.
The most dangerous data issues aren't the ones you've already documented. They're the ones that arise from a unique combination of filters, date ranges, and business context that nobody anticipated. A new product launch changes how you count "active." A pricing change means historical revenue comparisons are misleading. A Salesforce field that was reliable for three years suddenly has garbage data because someone changed a validation rule.
These aren't documentation failures. They're the kind of messy, contextual problems that require a human who knows the business, knows the data, and can recognize when something doesn't look right - even when the numbers technically compute.
You need observability on the outputs, not just the pipelines
I think we're heading toward a world where data observability isn't just about monitoring pipelines for freshness and schema changes. It's about monitoring the outputs - the reports, dashboards, and AI-generated analyses that people are using to make decisions.
Say your sales VP asks an AI assistant "how did we do last quarter?" The AI generates a response using your data warehouse. Before that response reaches the VP, an observability layer checks:
- Is the underlying data fresh? Or is the pipeline stale and showing numbers from two days ago?
- Do the metric definitions match your canonical definitions? Or did the AI interpret "revenue" differently than your finance team does?
- Are there known data quality issues? Maybe the product usage data has been unreliable since last Tuesday's deploy.
- Has this question been answered differently by the data team? If so, flag the discrepancy.
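A crude sketch of what that checking layer might look like. Everything here - the `MetricAnswer` shape, the canonical definitions, the known-issues set - is invented for illustration; a real system would pull these from a registry and an incident tracker rather than hardcoding them:

```python
from dataclasses import dataclass

# Hypothetical structures: in practice these would come from a metric
# registry and a data-quality incident tracker, not module-level constants.
canonical_definitions = {"revenue": "GAAP-recognized revenue, net of refunds"}
known_issues = {"product_usage"}  # sources flagged since a bad deploy

@dataclass
class MetricAnswer:
    metric: str
    definition: str          # how the AI interpreted the metric
    data_as_of_hours: float  # age of the underlying data, in hours

def review(answer: MetricAnswer, sources: set[str], freshness_sla_hours: float = 24) -> list[str]:
    """Return warnings to attach before the answer reaches a human."""
    warnings = []
    if answer.data_as_of_hours > freshness_sla_hours:
        warnings.append(
            f"Stale data: {answer.data_as_of_hours:.0f}h old (SLA {freshness_sla_hours:.0f}h)")
    canonical = canonical_definitions.get(answer.metric)
    if canonical and canonical != answer.definition:
        warnings.append(
            f"Definition drift: AI used '{answer.definition}', canonical is '{canonical}'")
    flagged = sources & known_issues
    if flagged:
        warnings.append(f"Known data quality issues in: {', '.join(sorted(flagged))}")
    return warnings

ans = MetricAnswer("revenue", "sum of invoices issued", data_as_of_hours=50)
for w in review(ans, sources={"billing", "product_usage"}):
    print("⚠", w)
```

The point isn't the implementation - it's that every one of those checks is mechanical. Nothing here requires a human, which is exactly why it can keep up with AI-speed querying.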
This doesn't exist yet - at least not in a way that's practical and lightweight. But it needs to. As AI makes it trivially easy for anyone in the company to query data, the governance layer has to become equally automated and equally fast.
What we actually need
The answer isn't less AI. Obviously. You can't put that genie back. But someone needs to be the air traffic controller. Here's what I'd actually build if I had a magic wand and a free weekend:
1. A canonical metric registry
Not a data dictionary that lives in Notion and gets updated quarterly. A machine-readable registry of every metric that matters - its definition, its SQL, its caveats, and its owner. When an AI tool generates a report, it should check this registry. If the metric it produced doesn't match the canonical version, flag it.
2. Report-level observability
We monitor data pipelines for freshness, volume, and schema. We should also monitor dashboards and AI-generated reports for drift. If a number that's been stable at $3M for six months suddenly jumps to $5M in someone's AI-generated analysis, that should trigger an alert, not a board slide.
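A drift check like that can be almost embarrassingly simple. Here's a toy version using the $3M-to-$5M example - the history and the threshold are made up, and a real monitor would keep per-metric baselines rather than a hardcoded list:

```python
import statistics

def drifted(history: list[float], latest: float, max_sigma: float = 3.0) -> bool:
    """Flag `latest` if it sits more than `max_sigma` standard deviations
    from the historical mean. Threshold is an illustrative default."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    if sd == 0:
        return latest != mean
    return abs(latest - mean) / sd > max_sigma

# Six months of a metric hovering near $3M, then an AI report says $5M.
history = [3.00e6, 2.95e6, 3.05e6, 3.10e6, 2.90e6, 3.00e6]
print(drifted(history, 5.00e6))  # True  -> alert, not a board slide
print(drifted(history, 3.02e6))  # False -> normal wobble, let it through
```

This is the same anomaly-detection logic we already run on pipeline row counts. The only new idea is pointing it at the numbers people actually present.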
3. Human-in-the-loop for high-stakes decisions
Let the AI handle 80% of ad hoc questions. But for anything that goes to the board, into a sales forecast, or drives a budget decision, a human data professional should validate it. Not because AI is unreliable - but because the cost of being wrong on a board metric is way higher than the cost of waiting a day for a human to check.
4. Centralized governance, decentralized access
The goal isn't to put the data team back as a bottleneck. It's to let everyone query and explore freely while maintaining a governed layer that ensures consistency. Anyone can drive on the road, but there are still traffic lights and lane markings. AI-powered analytics needs the same thing.
Why this matters right now
AI tooling for data is getting better every month. Within a year, most companies will have some form of "ask your data anything" baked into their stack. That's exciting. Faster decisions, more experimentation, less waiting for the data team to build a dashboard.
But without the governance and observability layer, we're going to see a wave of companies making confident decisions on wrong numbers. Not because the underlying data is bad, but because the context around it was lost in translation.
The data team's job is shifting. Less "build this dashboard" and more "make sure every number in the company can be trusted." Harder job. More important one.
Everybody's got a dashboard now. The question is whether any two of them agree.