
We've been saying "AI for data" for two years. I think we got the framing wrong.
Everything launched under that umbrella — text-to-SQL, dashboard chatbots, notebook-first agents — starts from the same assumption: the analytics layer is finished, you just have to stick an LLM on top. And it's wrong.
The modern analytics layer was designed for humans reading dashboards. Not for autonomous agents operating on data. And the difference is not cosmetic.
A Data Agent Platform is software infrastructure to deploy, equip, govern, and scale autonomous agents that interact with data to execute analytical, operational, and decision-making tasks. It's not "AI + data" — it's an architectural paradigm where agents are the primary unit of execution. That means the platform needs to create agents, connect them to data sources, equip them with tools, manage their state and memory, orchestrate multiple agents collaborating, govern their actions, and measure their cost.
A human opens a dashboard, reads an ambiguous number, pings an analyst on Slack, looks up a Confluence doc, applies judgment. An agent can't do any of that. An agent needs an explicit tool schema, hard guardrails that don't depend on human discipline, procedural memory that accumulates corrections, and per-invocation cost accounting. No traditional platform gives that out of the box. They have to bolt it on. And bolted-on agents are feature releases, not architecture.
That's why we're building Calliope Data as the first Data Agent First Platform — a category we believe will define the next decade of analytical infrastructure. The foundational layers already work. The upper layers are where we're headed.
How we got here
We didn't start by saying "let's create a category." We started by building a product.
For a year we developed Calliope Marketing — a vertical specialization of Calliope Data for marketing teams. An agent that could answer questions about attribution, CAC, LTV, conversion, using real company data, with semantic ontology, business rules, and cost accounting. A real agent, in production, on real data.
In that process we discovered something: the vertical is extremely valuable, but the platform underneath is what makes all the difference. The ontology, the pipeline, the rules, the cost accounting — everything we built for a marketing agent to operate with rigor was reusable for any agent operating on any data. Finance, operations, product, support — same substrate, same guarantees, different domain.
That was the inflection point. Calliope Marketing validated the thesis; now the platform generalizes it.
And there's a design principle that came out of that experience: independence. Calliope Data is agnostic to AI models and cloud providers. You're not locked into OpenAI, Anthropic, or any hyperscaler. If a better or cheaper model drops tomorrow, you just switch. If your company can't send data to a specific cloud, it doesn't have to. Independence isn't a feature — it's an architectural principle that protects the customer from obsolescence in a market that changes every six months.
What "Agent-First" actually means
An Agent-First Data Platform is not a database with a chatbot bolted on top. It's infrastructure where agents are the primary unit of execution and every layer exists to support their lifecycle:
Connect: ingestion produces a stable substrate — versioned snapshots the agent operates on without race conditions.
Equip: the semantic layer emits a tool schema the agent consults before every decision.
Manage: validated examples and conversation state are the agent's procedural memory.
Govern: business rules are hard guardrails that travel with every query — the agent can't bypass them.
Measure: observability breaks down cost per agent, not just per model or per user.
Create and Orchestrate: the ability to define custom agents and coordinate multiple specialized ones — the direction we're building toward.
All of the foundational layers exist in a traditional analytical stack too — but they were designed for the wrong consumer. dbt's semantic layer was built for BI tools. Data catalogs were built for data stewards. Cost dashboards were built for FinOps teams. None of them treat the agent as a first-class citizen, because all of them were built before agents were a legitimate architectural question.
Calliope was built after. That's why the framing shifts.
The four layers, seen from the agent's perspective
1. Substrate: a stable world to read
Agents need to read data that doesn't shift under their feet mid-conversation. Calliope's pipeline has five explicit phases.
The first three phases are classic ingestion. The interesting ones are the last two: generate the ontology and generate the examples. In most stacks, the pipeline ends at "data loaded" and the semantic layer is a side project someone maintains by hand. In Calliope, the semantic layer is part of the pipeline. Every time new data arrives, the agent's context refreshes automatically — and the manual edits humans made are preserved.
2. Semantics: the ontology as tool schema
A generic LLM doesn't know that customer_revenue excludes returns at your company. It can't infer it from the column name. It can't infer it from the data type. Somebody has to tell it explicitly.
In Calliope, the ontology is exactly that: semantic descriptions of tables and columns, AI-generated but human-editable. And here's the design decision that matters: when the ontology regenerates, human edits are preserved. The LLM refreshes what the LLM wrote; humans keep what humans know. The product even states it verbatim in the UI: "Regenerates the semantic description of tables and fields using AI. Useful when data sources have been modified."
That's the agent's tool schema. Before generating any query, the agent reads what each column means, which joins are valid, which tables are canonical. Hallucinations drop to the floor.
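To make the idea concrete, here is a minimal sketch of an ontology entry rendered as the context an agent reads before writing SQL. The class and field names are illustrative, not Calliope's actual API; the key design point it models is that human-edited descriptions carry a flag so regeneration can preserve them.

```python
# Hypothetical sketch: an ontology entry surfaced to the agent as a tool
# schema before any SQL is generated. Names are illustrative.
from dataclasses import dataclass, field


@dataclass
class ColumnSemantics:
    name: str
    dtype: str
    description: str            # AI-generated, human-editable
    human_edited: bool = False  # preserved across regenerations


@dataclass
class TableOntology:
    table: str
    description: str
    columns: list[ColumnSemantics] = field(default_factory=list)

    def to_tool_schema(self) -> dict:
        """Render the ontology as the context block the agent consults
        before generating any query."""
        return {
            "table": self.table,
            "description": self.description,
            "columns": {
                c.name: {"type": c.dtype, "meaning": c.description}
                for c in self.columns
            },
        }


revenue = TableOntology(
    table="customer_revenue",
    description="Net revenue per customer.",
    columns=[
        ColumnSemantics(
            name="amount",
            dtype="numeric",
            description="Net of returns (excludes refunded orders).",
            human_edited=True,  # a human wrote this; regeneration keeps it
        )
    ],
)

schema = revenue.to_tool_schema()
print(schema["columns"]["amount"]["meaning"])
```

The agent reads `meaning` instead of guessing from the column name — which is exactly why `customer_revenue` excluding returns stops being tribal knowledge.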
3. Governance: rules as guardrails
In most analytical stacks, business rules live in a wiki, a Notion doc, or the head of a senior analyst who's leaving in 18 months. They're applied by human discipline. When an agent shows up, human discipline is the first thing that breaks.
In Calliope, business rules are versioned, toggleable, categorized objects with markdown descriptions. They're applied to every query the agent runs — not to specific dashboards. The product hero says it verbatim: "Query {tables} tables with {rules} business rules applied." The rule isn't optional. The agent can't skip it even if it wants to.
That's the guardrail. Governance that doesn't depend on discipline.
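A rough sketch of the mechanism, with illustrative names rather than Calliope's real objects: rules are data, and the only code path that builds the agent's context injects every enabled rule, so there is no path where the agent queries without them.

```python
# Hypothetical sketch: business rules as versioned, toggleable objects
# that travel with every query the agent runs. Names are illustrative.
from dataclasses import dataclass


@dataclass
class BusinessRule:
    name: str
    category: str
    description_md: str   # markdown description
    enabled: bool = True  # toggleable
    version: int = 1      # versioned


def apply_rules(prompt_context: str, rules: list[BusinessRule]) -> str:
    """Inject every enabled rule into the agent's context. The agent
    never sees a query path that skips this function."""
    active = [r for r in rules if r.enabled]
    rule_block = "\n".join(
        f"- [{r.category}] {r.description_md}" for r in active
    )
    return f"{prompt_context}\n\nBusiness rules (always applied):\n{rule_block}"


rules = [
    BusinessRule("net_revenue", "finance", "Revenue always excludes returns."),
    BusinessRule("fiscal_year", "finance", "Fiscal year starts in February.",
                 enabled=False),
]
ctx = apply_rules("Schema: customer_revenue(amount, ...)", rules)
print(ctx)
```

Disabled rules drop out; enabled ones are present in every invocation. Discipline becomes a property of the code path, not of the human.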
4. Agent economics: native cost accounting
LLM cost is the first resource you'll have to justify when you put agents at scale in front of enterprise data. But the conventional framing — "cost per request" or "cost per model" — doesn't cut it. What matters is cost per agent per useful answer.
Calliope ships an LLM usage dashboard that breaks down cost by model, user, and agent. Not as a bolted-on observability tab; as a product requirement. The reason: 18 months from now, when you have 15 custom agents running on your data — one for finance, one for marketing, one for ops — the question isn't going to be "how much did Claude cost us this month?". It's going to be "does the finance agent have positive ROI, or is it burning tokens exploring things nobody uses?".
That metric — cost per agent — is going to be this category's unit economics. Whoever measures it first wins.
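The aggregation itself is simple; the point is which key you group by. A minimal sketch, with made-up per-token prices and illustrative function names:

```python
# Hypothetical sketch: per-agent cost accounting, aggregating token
# spend by agent rather than only by model or user. Rates are made up.
from collections import defaultdict

PRICE_PER_1K_TOKENS = {"model-a": 0.003, "model-b": 0.015}  # illustrative


def cost_by_agent(events):
    """events: (agent, model, tokens) tuples from the usage log."""
    totals = defaultdict(float)
    for agent, model, tokens in events:
        totals[agent] += tokens / 1000 * PRICE_PER_1K_TOKENS[model]
    return dict(totals)


usage = [
    ("finance-agent", "model-b", 12_000),
    ("marketing-agent", "model-a", 40_000),
    ("finance-agent", "model-a", 8_000),
]
totals = cost_by_agent(usage)
# finance-agent: 12 * 0.015 + 8 * 0.003 = 0.204
# marketing-agent: 40 * 0.003 = 0.12
print(totals)
```

Group by model and you learn which vendor is expensive; group by agent and you learn which agent is worth keeping. The second question is the one the business will ask.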
Where we are and where we're going
A complete Data Agent Platform covers seven capabilities: create agents, connect them to data, equip them with tools, manage their state, orchestrate several agents collaborating, govern their actions, and measure their cost. An honest manifesto says which ones already work and which are the direction.
What already works (foundational layers): data connection via a 5-phase pipeline, ontology as tool schema, validated examples as procedural memory, business rules as query-time guardrails, and per-agent cost accounting. These layers are in production.
What's coming (upper layers):
An MCP server and a tool-calling API so any agent — Claude, Cursor, your custom one — can use Calliope's ontology, rules, and examples as tools. Today the assistant is the primary surface. Tomorrow, any agent can operate with the same guarantees.
Custom agents per organization: "ACME's finance agent has this system prompt, these fixed examples, these extra rules." Creating specialized agents from within the platform.
Multi-agent orchestration: a planner agent that invokes specialized ones and consolidates. Today it's a single linear conversation.
Agent-centric audit trail with captured intent: what was asked, what SQL was generated, which rules applied, which example was used, how much it cost — all in one view.
We don't treat these as weaknesses. We treat them as the explicit direction of the product. The architecture is already designed to receive them without rewrites.
Why this matters now
There's a window. Over the next 12 months, every major analytical stack is going to ship their version of "AI agent". Snowflake will ship Cortex Analyst. Databricks will expand Genie. ThoughtSpot, Mode, Hex — every one of them will have theirs. And they'll all share the same flaw: they'll be bolt-ons on top of architectures designed before agents were a real question.
We think the opposite is the opportunity: build from day one with the agent in mind, and let the rest of the platform come out of that constraint. If we're right, the term "Data Agent First" will start appearing in other people's pitches in six months. If we're right, in 18 months it'll be a Gartner category.
If we're wrong, at least we'll have built something that makes it much easier for a non-technical analyst to ask their data a question without opening a ticket.
What's next
This is the first of six articles. Over the coming weeks we'll go layer by layer:
This article — the manifesto.
The substrate agents need — how raw data reaches a stable, agent-readable state.
Your ontology is the agent's tool schema — why naive text-to-SQL fails and how we solve it.
Business rules as agent guardrails — governance that doesn't depend on human discipline.
Agent memory — validated examples and streaming as a feedback loop.
Agent accounting — LLM cost as a product problem, not a FinOps problem.
If the thesis lands, contact us and we'll show you more.
