
Databricks turns AI-agent traces into governed data

Joachim Høgby

May 22, 2026 | 5 min read | Source: Databricks


Databricks is moving AI-agent observability from a side log into governed enterprise data.

On May 22, the company said it now supports writing OpenTelemetry traces directly into Unity Catalog. The technical wording matters less than the operational shift: traces from agent work can be stored as Delta tables, governed with the same access controls as other data, queried with SQL, and reused for evaluation and monitoring.

For CIOs, CISOs and boards, this is more important than another chatbot feature. Once agents get access to tools, documents and production systems, the trace of what they actually did becomes part of the control environment. Who asked what? Which tools were called? Which data was touched? How long did it take? What did it cost? Where did the agent fail? Which real interactions should become evaluation data, and which must be masked or deleted?

Databricks is trying to make those questions part of the data platform, not a separate observability silo.

Agent logs become control data

Databricks frames the problem clearly: AI agents produce large amounts of trace data. Those traces can include prompts, tool calls, responses, latency and the full execution path through a task. Traditional observability tools are strong for operational signals, but Databricks argues they become expensive and awkward when traces need long retention, analytics reuse and governance as sensitive data.

The new support lets teams write OpenTelemetry data directly into Unity Catalog through a managed serverless ingestion path. The data lands in Delta tables. From there it can be used in SQL, dashboards, ETL, MLflow evaluation, MLflow monitoring and Databricks Genie.

The leadership point is simple: production behavior from AI agents can become as auditable as other business data. That is not only useful for developers debugging a workflow. It matters for security, compliance, finance and vendor governance.

Why this matters in production

Many agent projects stall between demo and production. Not because the model cannot answer, but because the business cannot see enough of what the agent is doing. Logging becomes fragmented. Costs are unclear. Evaluation data sits in one tool, security logs in another and business outcomes in a third.

Databricks is targeting that exact gap. The company highlights three practical benefits:

1. Traces can be written in real time and retained longer without the same cost pressure as pure SaaS observability models.
2. Traces can be governed through Unity Catalog, including access controls, column masking and row-level filtering.
3. Traces can be joined with business data, model costs and evaluation results.
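
To make the join between traces and business data concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for a SQL warehouse. The table names (`agent_traces`, `support_cases`), the columns and the sample rows are all hypothetical. They are not taken from the Databricks announcement and do not reflect the actual schema of the trace tables.

```python
import sqlite3

# In-memory SQLite stands in for a SQL warehouse; all names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agent_traces (trace_id TEXT, case_id TEXT, latency_ms INTEGER, status TEXT);
CREATE TABLE support_cases (case_id TEXT, resolved INTEGER, csat INTEGER);
INSERT INTO agent_traces VALUES
  ('t1', 'c1', 1200, 'OK'),
  ('t2', 'c2',  800, 'OK'),
  ('t3', 'c3', 4000, 'ERROR');
INSERT INTO support_cases VALUES
  ('c1', 1, 5),
  ('c2', 0, 2),
  ('c3', 0, 1);
""")

# Join agent behavior (status, latency) with business outcomes (resolution, CSAT).
rows = conn.execute("""
SELECT t.status,
       COUNT(*)          AS n_traces,
       AVG(t.latency_ms) AS avg_latency_ms,
       AVG(c.resolved)   AS resolution_rate,
       AVG(c.csat)       AS avg_csat
FROM agent_traces AS t
JOIN support_cases AS c ON c.case_id = t.case_id
GROUP BY t.status
ORDER BY t.status
""").fetchall()

for row in rows:
    print(row)
```

The point of the sketch is the shape of the question, not the engine: once traces are ordinary tables, "did fast, error-free runs actually resolve more cases?" becomes a single query.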

The third point is often underrated. An agent is not good simply because it responds quickly. It must solve the right task, use the right tools, stay within policy and create measurable value. If traces can be joined with conversion, case resolution time, customer satisfaction, risk or incidents, leadership can discuss quality with data instead of anecdotes.

OpenTelemetry as a shared language

Databricks uses OpenTelemetry as the format. That matters because it separates instrumentation from storage. Companies can instrument agents and applications with a standard format while choosing where the traces land.

In the Databricks architecture, traces, logs and metrics flow into Unity Catalog tables. The ingestion layer is, according to the company, powered by Zerobus Ingest and supports OpenTelemetry protocols through gRPC and REST. Databricks says this avoids a separate chain of Kafka, staging layers and custom data pipelines before traces can be analyzed.
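
As an illustration of what "a standard format" means in practice, the sketch below builds a single span in the OTLP/JSON encoding using only the Python standard library. It does not send anything. The Databricks endpoint, authentication and target table configuration are not described in this article and are deliberately left out. The service name, span name and attribute values are made up for the example.

```python
import json
import secrets


def otlp_attr(key, value):
    """Encode one attribute as an OTLP/JSON key-value pair (bool, int or string)."""
    if isinstance(value, bool):
        return {"key": key, "value": {"boolValue": value}}
    if isinstance(value, int):
        # 64-bit integers are carried as strings in the protobuf JSON mapping.
        return {"key": key, "value": {"intValue": str(value)}}
    return {"key": key, "value": {"stringValue": str(value)}}


def build_trace_payload(service_name, span_name, attributes, start_ns, end_ns):
    """Return an OTLP/JSON trace export body containing one span."""
    span = {
        "traceId": secrets.token_hex(16),  # 16 random bytes, hex-encoded
        "spanId": secrets.token_hex(8),    # 8 random bytes, hex-encoded
        "name": span_name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [otlp_attr(k, v) for k, v in attributes.items()],
    }
    return {
        "resourceSpans": [{
            "resource": {"attributes": [otlp_attr("service.name", service_name)]},
            "scopeSpans": [{"scope": {"name": "example.agent"}, "spans": [span]}],
        }]
    }


# Illustrative values only; attribute names are examples, not a required schema.
payload = build_trace_payload(
    service_name="support-agent",
    span_name="tool.call",
    attributes={"tool.name": "lookup_order", "input_tokens": 412},
    start_ns=1_700_000_000_000_000_000,
    end_ns=1_700_000_000_250_000_000,
)
print(json.dumps(payload, indent=2))
```

In a real deployment this body would normally be produced by an OpenTelemetry SDK and exporter rather than by hand. Because the payload is the same regardless of backend, the storage destination can change without touching the agent's instrumentation.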

The blog also shows an example with a LangGraph agent running outside Databricks, using a Databricks-hosted Claude Sonnet 4.6 model and calling Genie through MCP for SQL analysis. The exact demo is not the core point. The important point is that traces can be centralized even when agents run in different places.

That maps to a real enterprise problem. AI agents will not live in one neat system. They will show up in customer service, developer tools, finance, HR, analytics and core business applications. Shared trace formats, data ownership and clear access rights will become basic operating requirements.

FinOps meets security

Databricks also points to cost governance. When traces are stored as tables, companies can analyze token use, latency, error rates and model choices. The blog shows how teams can build cost dashboards that use negotiated contract pricing instead of generic list prices.

That is a CFO issue. AI cost quickly becomes invisible when it is hidden inside agent chains. One user task can trigger retrieval, several model calls and multiple tool calls. Without traces at the right level, it is hard to know what a process actually costs.
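
A rough sketch of the arithmetic behind cost per task: sum token usage across every model call in one trace and price it with contract rates. The model names, prices and span fields below are hypothetical placeholders, not figures from Databricks or any vendor.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical negotiated rates in USD per 1,000 tokens.
CONTRACT_PRICE_PER_1K = {
    "model-a": {"input": Decimal("0.002"), "output": Decimal("0.008")},
    "model-b": {"input": Decimal("0.0005"), "output": Decimal("0.0015")},
}

# Hypothetical spans: one user task (trace) fans out into several calls.
spans = [
    {"trace_id": "task-1", "model": "model-a", "input_tokens": 1000, "output_tokens": 500},
    {"trace_id": "task-1", "model": "model-b", "input_tokens": 4000, "output_tokens": 0},
    {"trace_id": "task-1", "model": None, "input_tokens": 0, "output_tokens": 0},  # tool call
    {"trace_id": "task-2", "model": "model-b", "input_tokens": 2000, "output_tokens": 1000},
]


def cost_per_task(spans, prices):
    """Sum model cost per trace_id; spans without a model contribute zero."""
    totals = defaultdict(Decimal)
    for span in spans:
        model = span.get("model")
        if model is None:
            totals[span["trace_id"]] += Decimal(0)  # keep the task in the result
            continue
        rate = prices[model]
        totals[span["trace_id"]] += (
            Decimal(span["input_tokens"]) / 1000 * rate["input"]
            + Decimal(span["output_tokens"]) / 1000 * rate["output"]
        )
    return dict(totals)


costs = cost_per_task(spans, CONTRACT_PRICE_PER_1K)
for task, usd in sorted(costs.items()):
    print(task, usd)
```

Using `Decimal` avoids floating-point rounding drift when many small per-call amounts are added up. Swapping the price table is all it takes to compare list prices with negotiated ones.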

It is also a CISO issue. Traces can contain raw prompts, responses and sensitive data. If those traces are sent casually to third-party observability tools, companies create new processor, access and data-sovereignty problems. If traces are governed in the same data platform as other sensitive data, policy enforcement becomes more manageable. Not automatically safe, but more controllable.
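
To show what masking trace content can look like, here is a deliberately simple sketch that redacts email addresses and long digit sequences from prompt and response fields. It is not how Unity Catalog column masking works, and pattern matching of this kind will miss many forms of personal data. The field names and patterns are assumptions made for the example.

```python
import re

# Simple illustrative patterns; real PII detection needs far more than this.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_NUMBER = re.compile(r"\b\d{9,}\b")  # e.g. account or ID numbers


def mask_text(text):
    """Replace email addresses and digit runs of 9+ characters with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return LONG_NUMBER.sub("[NUMBER]", text)


def mask_span(span, sensitive_fields=("prompt", "response")):
    """Return a copy of a span dict with the sensitive string fields masked."""
    return {
        key: mask_text(value) if key in sensitive_fields and isinstance(value, str) else value
        for key, value in span.items()
    }


raw = {
    "trace_id": "t1",
    "prompt": "Contact ola@example.com about account 123456789012",
    "response": "Done.",
    "latency_ms": 950,
}
masked = mask_span(raw)
print(masked["prompt"])
```

Whether masking happens before the trace leaves the application or at query time inside the platform is a design decision with different risk profiles. The sketch only shows the former.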

What leaders should do now

This is not a reason to buy another platform blindly. It is a signal about what must exist before agents get real authority.

The CIO should require an agent-tracing architecture before the next major rollout. Which events are logged? How long are they retained? Can traces be joined to business outcomes? Can they be reused for regression testing when a model, prompt or tool changes?
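
One way to picture the reuse of traces for regression testing: replay recorded inputs against a changed agent and flag cases where the sequence of tool calls differs. Everything in this sketch is hypothetical, including the span fields, the recorded cases and the `new_agent` stand-in. A real evaluation would also compare outputs, cost and policy compliance, not only tool order.

```python
def tool_sequence(spans):
    """Return tool names in start-time order, skipping spans that are not tool calls."""
    ordered = sorted(spans, key=lambda s: s["start_ns"])
    return [s["tool"] for s in ordered if s.get("tool")]


def regression_report(recorded_cases, run_agent):
    """Replay each recorded input and compare tool sequences with the stored trace."""
    report = []
    for case in recorded_cases:
        expected = tool_sequence(case["spans"])
        actual = tool_sequence(run_agent(case["input"]))
        report.append({
            "input": case["input"],
            "expected": expected,
            "actual": actual,
            "changed": expected != actual,
        })
    return report


# Hypothetical recorded traces from production.
recorded = [
    {"input": "refund order 17", "spans": [
        {"start_ns": 2, "tool": "issue_refund"},
        {"start_ns": 1, "tool": "lookup_order"},
    ]},
    {"input": "order status 18", "spans": [
        {"start_ns": 1, "tool": "lookup_order"},
    ]},
]


def new_agent(user_input):
    """Stand-in for the changed agent; it now skips the lookup before refunding."""
    if user_input.startswith("refund"):
        return [{"start_ns": 1, "tool": "issue_refund"}]
    return [{"start_ns": 1, "tool": "lookup_order"}]


report = regression_report(recorded, new_agent)
for item in report:
    print(item["input"], "changed" if item["changed"] else "same")
```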

The CISO should ask the harder questions. Which prompts and responses can contain personal data, trade secrets or customer data? Who can read traces? Is data masked before analysis? Can traces support incident response? Is there an audit log for who searches through them?

The CFO should get cost per process, not only total model spend. If an agent is meant to replace manual work, the company needs to measure both quality and cost at task level.

And the board should ask for one thing: the agent’s actions must be reviewable. Not only the model’s final answer.

The Databricks announcement shows where enterprise AI is heading. The value is not only in the model. It is in the operating system around the model: traces, access, evaluation, cost control, rollback and continuous improvement. Boring infrastructure. Exactly why it matters.

Sources and media

Primary source: Databricks, “Observability for any agent, anywhere: Production-ready tracing with OpenTelemetry & Unity Catalog on Databricks”, published May 22, 2026: https://www.databricks.com/blog/observability-any-agent-anywhere-production-ready-tracing-opentelemetry-unity-catalog
Source credit: Databricks.
Thumbnail: OpenAI Image 2 / hogby.ai.
