
Agent-First Observability: Dynamic Data, High Cardinality, and the Business Impact

We want to transform how companies make decisions.

That is not what you expect to hear from an observability company. Observability tools are supposed to help you monitor systems, debug incidents, and maybe reduce downtime. Useful, but not exactly the foundation for business decision making.

So what does observability have to do with revenue, churn, or customer experience? 

More than you think, because observability already sits on top of the most important data in your business. The problem is that most platforms are architecturally incapable of letting you use it.

How to expose the gap 

The business cardinality test 

Ask any observability vendor the following:

“I run a video streaming service with hundreds of millions of users. I need to prove I met my SLA for each user. Not average latency, not system uptime. Actual per user experience. So I want to attach user_id to my metrics and send a billion time series every day. Can your system handle that?” 

The polite version is that this is not an observability use case. The honest version is that the system cannot handle it. 

However, when Postgres has high CPU or Redis runs out of memory, the questions that drive real decisions are not “is the error rate above threshold?” They are:

  • Which specific customers were affected?
  • How much money is this costing us right now?
  • Do I charge that user $15 this month, or do I owe them a refund?

High-cardinality fields like user_id, tenant_id, and session_id create index bloat in legacy time series databases, so vendors ask you to drop them or pre-aggregate before ingestion. The result is that the dimensions that make data meaningful are exactly the ones the system is designed to discard.
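The arithmetic behind that index bloat is worth seeing in miniature. A hedged sketch, with illustrative label names and cardinalities rather than figures from any real system: the potential series space is the product of each label's cardinality, so one business dimension can multiply the total by hundreds of millions.

```python
def series_count(label_cardinalities):
    """Potential distinct time series = product of per-label cardinalities.

    In practice, active series are bounded by the combinations actually
    observed, but legacy TSDB indexes degrade long before that bound.
    """
    total = 1
    for cardinality in label_cardinalities.values():
        total *= cardinality
    return total

# Hypothetical infrastructure-only labels: cheap for any TSDB.
infra_only = {"service": 50, "region": 5, "status_code": 8}

# Attach one business dimension and the series space explodes.
with_users = {**infra_only, "user_id": 200_000_000}

print(series_count(infra_only))  # 2000
print(series_count(with_users))  # 400000000000
```

This is why legacy vendors push you to drop `user_id` at ingestion: their index cost scales with the series space, not with the business value of the dimension.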

There is a difference between bad cardinality (noise, redundant labels, low-signal dimensions) and business cardinality: the fields that connect system behavior to real users and real revenue. Legacy platforms cannot tell the difference. Coralogix is built to handle both, currently supporting customers sending over two billion time series per day for per-user SLA monitoring.

The “unknown unknowns” 

Here is the second test. Ask any vendor: “Can you help me uncover unknown unknowns?”

Every one of them will say yes.

Then ask them how. They will tell you to define your schemas, choose your facets, and decide in advance which fields should be indexed and searchable.

What they are really asking you to do is sit down today and predict every question you will need to ask tomorrow when a crisis hits. If you can define the dimensions in advance, the problem is not unknown. At best, it is a known unknown: you haven’t seen it yet, but you know roughly where to look.

Real failures do not behave like that. They come from interactions nobody modeled: a new service introducing a field that was never indexed, a payload changing shape mid-incident, a combination of conditions that nobody anticipated. The moment something like that happens, a schema-bound system stops working for you. If the field was not indexed, it does not exist. If it does not exist, you cannot search it, aggregate it, or correlate it. The data is there. The system just cannot see it.

That is not a missing feature. It is a structural limitation baked into how the data is stored.

The shape of data defines the limits of reasoning

When you force structure at ingestion, by defining schemas, choosing facets, and pre-indexing fields, you are making a bet on every question you will ever need to ask.

That bet does not just limit what engineers can query; it limits what any system can reason about.

If a field was not indexed yesterday, it is effectively invisible today. Neither a human nor an AI system can access it, correlate it, or use it to explain what happened. Intelligence is capped by what the system chooses to store and expose.

Coralogix takes the opposite approach. Every field, every label, every attribute, including business dimensions like cart_value, plan_tier, and transaction_id, is stored raw and made immediately searchable without predefined schemas. Structure is applied at query time, not ingestion. That means any field can be joined with any other field, at the moment you need it, against the original event data.
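The difference is easy to see in miniature. A minimal sketch of query-time structure, with invented events and field names: events are stored as raw JSON, and a field like plan_tier that no schema ever declared can still be filtered and grouped on the moment a question needs it.

```python
import json

# Raw events stored as-is; no field was declared ahead of time.
raw_events = [
    '{"level": "error", "service": "payment", "plan_tier": "pro", "cart_value": 120.5}',
    '{"level": "info", "service": "search"}',
    '{"level": "error", "service": "payment", "plan_tier": "free", "cart_value": 20.0}',
]

# Structure applied at query time: parse each event and aggregate
# on a business field no ingestion-time schema ever mentioned.
revenue_at_risk = {}
for line in raw_events:
    event = json.loads(line)
    if event.get("level") == "error" and "cart_value" in event:
        tier = event.get("plan_tier", "unknown")
        revenue_at_risk[tier] = revenue_at_risk.get(tier, 0.0) + event["cart_value"]

print(revenue_at_risk)  # {'pro': 120.5, 'free': 20.0}
```

A schema-bound store would return nothing for `plan_tier` unless someone had indexed it before the incident; here the question and the structure arrive at the same time.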

Think of it as a continuous breadcrumb trail. Individually, each log line, span attribute, or metric label is a small signal. Together, they form the path that explains what actually happened. Most platforms force you to define that path before the incident occurs. We let you follow it wherever it leads.

AI doesn’t change this; it exposes it

AI agents are now part of the observability workflow. They investigate incidents, correlate signals, and surface recommendations. At the same time, an LLM is capped by what it can see. It can only reason about what the schema allows it to access, or else it hallucinates to fill the gaps.

The two tests make this concrete: if your platform cannot handle per-user cardinality, the AI cannot reason about user-level impact. If fields were not predefined, the AI cannot explore them, because it is confined to the same boundaries the schema allows. Trusting anyone to pick the right facets or tables in advance does not just limit your engineers; it also blinds your AI. Feeding an AI agent through a schema-constrained system is not agent-first observability.

For an AI agent to actually be useful, it needs what any good engineer needs during an incident: access to everything. Every field. Every dimension. Every signal, joinable across infrastructure and business context in real time.

Olly: Reasoning on the full picture 

If every field is available, every dimension can be joined, and no schema limits what you can query, then the next step is not another dashboard; it is a system that can reason on top of that data.

So Coralogix introduced Olly. Olly is not an assistant layered on top of observability. It is the first AI agent that operates natively on a full, schema-free data surface, reasoning across logs, metrics, traces, and business context simultaneously, without predefined structure limiting what it can see.

Consider a checkout failure in an ecommerce system.

An engineer investigating manually would need to:

  • Identify a spike in 5xx errors in the payment service
  • Trace failed requests across checkout, payment gateway, and downstream dependencies
  • Join those failures with user_id, session_id, and transaction_id
  • Correlate with cart_value and customer segments to quantify revenue at risk
  • Track how the issue evolves across users and sessions over time

This is not a single query. It is a sequence of steps across different data types, dimensions, and levels of abstraction.
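As a toy illustration of that sequence (all identifiers, sources, and values invented), the middle steps amount to joining failure records from traces with user and transaction context, then aggregating revenue at risk by segment:

```python
# Toy data standing in for three signal sources.
failed_requests = [  # from traces: 5xx errors in the payment service
    {"trace_id": "t1", "session_id": "s1"},
    {"trace_id": "t2", "session_id": "s2"},
]
sessions = {  # from logs: session -> user and segment
    "s1": {"user_id": "u42", "segment": "enterprise"},
    "s2": {"user_id": "u7", "segment": "free"},
}
transactions = {  # from business events: session -> transaction value
    "s1": {"transaction_id": "tx9", "cart_value": 310.0},
    "s2": {"transaction_id": "tx10", "cart_value": 25.0},
}

# Join failures with user- and transaction-level context,
# then quantify revenue at risk per customer segment.
impact = {}
for req in failed_requests:
    sid = req["session_id"]
    segment = sessions[sid]["segment"]
    impact[segment] = impact.get(segment, 0.0) + transactions[sid]["cart_value"]

print(impact)  # {'enterprise': 310.0, 'free': 25.0}
```

Even this toy version needs three joins across three data shapes; the real investigation multiplies that across services, time windows, and evolving payloads.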

Olly collapses that sequence into a single reasoning flow.

  • It traverses the “breadcrumb” trail across logs, metrics, traces, and business data without requiring predefined schemas
  • It correlates failures with user-level and transaction-level context in real time
  • It surfaces which users are impacted, what those transactions are worth, and how that impact is evolving
  • It moves from identifying the failure to explaining what it means for the business

Olly is not constrained by yesterday’s schema decisions. It reasons on reality, not a filtered version of it.

Observability as a decision system

Ten years ago, data was called the new oil; collect everything and the value would follow. Then reality hit. Costs exploded, signal drowned in noise, and data became a liability: garbage in, garbage out.

Today, AI has changed the equation again. Data is valuable again, but only if you can actually use it. The bottleneck is no longer collection. It is access.

When observability data is dynamic, high-cardinality, schema-free, and instantly joinable across infrastructure and business context, it stops being a monitoring tool. It becomes the system that connects system behavior to user experience, performance to revenue, and incidents to decisions.

That is the shift we are building toward. Not more dashboards. Not better alerts. A different role for observability entirely: the primary decision layer of the organization.
