
Monitor your applications

Monitoring is one of four capabilities in AI Center — a complete platform for monitoring AI applications alongside Guardrails, Evaluations, and AI SPM. Use Monitoring to track the health, performance, cost, quality, and security of every AI application across your organization, and drill from an org-wide signal to the exact interaction that caused it.

AI Center is built around a top-down investigation flow: start with an organization-wide view, identify which application needs attention, then drill into that application down to the span level. This is the fastest path from "something looks wrong" to root cause.

How the views connect

AI Center has three complementary views that work together:
  • Overview — Organization-wide metrics: trends, issues, cost, errors, and latency across all AI applications. Start here to spot what's changing.
  • Application Catalog — All applications in one sortable table. Use it to compare KPIs and identify which application to investigate.
  • Application Drilldown — Deep metrics for a single application. Use it to understand errors, cost, latency, issues, and tool usage for one app.

The typical flow:

  1. Overview — Open the Overview to see the health of every LLM-based application across your entire team in one place. Spot what's changing: issue rates, cost spikes, latency trends.
  2. Application Catalog — Sort by any KPI to find the application that needs attention.
  3. Application Drilldown — Inspect errors, latency, and issues for that application.
  4. AI Explorer — Find the specific span-level interaction. Navigate from the application level to the exact prompt-response pair that caused the problem.

To get started:
  1. In the Coralogix UI, select AI Center, then Overview.
  2. Use the time picker to set the time range.
  3. Review the team-wide metrics in the Overview.
  4. Open the Application Catalog to compare applications across your team and select one to investigate.
  5. In the Application Drilldown, review errors, latency, cost, issues, and tool calls for that application.

Overview

The Overview gives you a cross-application snapshot — one place to see whether anything is trending in the wrong direction across your entire AI portfolio.

Summary

Start here to understand overall volume and whether reliability, issue rates, or spend are changing. The Key Insights panel shows a snapshot of every key metric across all your AI applications. If your applications use guardrails, this section also shows how often guardrail actions are triggered.

  • Models Used — All models in use across applications.
  • Time to Response — LLM response time (average, P90, P95, P99).
  • Token Usage — Total tokens consumed across all applications.
  • Estimated Cost — Total spend based on token usage and model costs.
  • Error Rate — Percentage of traces with errors across all applications.
  • Issue Rate — Percentage of prompts and responses flagged with issues.
  • Guardrail Actions — Percentage of AI spans where a guardrail policy was triggered. Visible when at least one application is connected to the Guardrails SDK.
  • Total AI Spans — Total number of AI spans across all applications.
  • Unique Users — Number of distinct users who made requests.
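
Most of these Key Insights reduce to simple aggregations over AI span data. The sketch below shows one way the token, cost, and issue-rate numbers can be derived; the per-1M-token prices, field names, and span records are illustrative assumptions, not Coralogix's actual schema or pricing.

```python
# Sketch of deriving Key Insights figures from span data.
# Prices and field names are illustrative assumptions only.
PRICE_PER_1M = {"gpt-4o": {"input": 2.50, "output": 10.00}}

spans = [
    {"model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300, "issue": False},
    {"model": "gpt-4o", "input_tokens": 800,  "output_tokens": 150, "issue": True},
]

def estimated_cost(spans):
    """Estimated Cost: token usage priced by model, input and output separately."""
    total = 0.0
    for s in spans:
        price = PRICE_PER_1M[s["model"]]
        total += s["input_tokens"] / 1e6 * price["input"]
        total += s["output_tokens"] / 1e6 * price["output"]
    return total

token_usage = sum(s["input_tokens"] + s["output_tokens"] for s in spans)
issue_rate = 100 * sum(s["issue"] for s in spans) / len(spans)  # % of flagged spans
print(f"tokens={token_usage} cost=${estimated_cost(spans):.4f} issues={issue_rate:.0f}%")
```

Error Rate and Guardrail Actions follow the same percentage shape, just with a different flag in the numerator.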

The Activity section below Key Insights shows Total AI Spans, Issue Rate, and Guardrail Actions as trend cards, alongside an Issues Over Time chart that overlays total spans, evaluation issues, and guardrail issues on a single timeline.

AI Center Overview showing Key Insights with models, response time, token usage, cost, error rate, issue rate, guardrail actions, total AI spans, and unique users, plus the Activity section with issues over time chart

The AI Center Overview Key Insights panel gives a complete snapshot of your AI portfolio — including guardrail action rates and unique user counts — with an Issues Over Time chart that separates evaluation issues from guardrail issues.

Issues

AI introduces a new class of problems that traditional observability cannot detect. An LLM can respond with a 200 OK and still return a hallucination, leak sensitive data, generate toxic content, or be manipulated by a prompt injection attack. None of these appear as errors in your logs.

AI Center surfaces these risks in real time through two complementary mechanisms:

  • Guardrails act in real time: intercepting, blocking, or modifying harmful outputs before they reach the user, not after.
  • Evaluations continuously assess every AI output against configurable quality and security policies — detecting hallucinations, compliance violations, toxicity, and more.

Together, they give you both prevention and detection — closing the gap between "something went wrong" and "it never reached the user."
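
The prevention-versus-detection split can be pictured as two hooks around the LLM response. The sketch below is a minimal illustration of that control flow only; `check_output` and `evaluate_output` are hypothetical placeholders, not Guardrails SDK or Evaluations API calls.

```python
# Illustrative sketch of prevention (guardrail) vs. detection (evaluation).
# check_output and evaluate_output are hypothetical placeholders, not SDK calls.
def check_output(text):
    # Guardrail: runs inline, before the response is delivered.
    banned = ["SSN:"]
    return not any(marker in text for marker in banned)

def evaluate_output(text):
    # Evaluation: scores the response and records issues for Monitoring.
    issues = []
    if len(text) < 5:
        issues.append("suspected truncation")
    return issues

def respond(llm_output):
    if not check_output(llm_output):      # prevention: block before delivery
        return "[blocked by guardrail]"
    flags = evaluate_output(llm_output)   # detection: flag, but still deliver
    if flags:
        print("flagged:", flags)
    return llm_output
```

Guardrail blocks and evaluation flags both surface as issues in Monitoring, which is why the Issues views combine them into one picture.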

Issues combine detected evaluation results and guardrail actions into a single view, so you see the full picture of what's wrong across your organization, regardless of whether it was caught by a policy or blocked by a guardrail.

  • Issue Rate — Whether the overall problem level is increasing.
  • Issue Distribution — Split between security issues and quality issues.
  • Top 3 Applications with Issues — Applications with the highest issue-to-message ratio.

Start with the Issue Rate to see the trend. Use Issue Distribution to identify whether the driver is security or quality. Drill into AI Explorer to inspect the specific span-level interactions.

Screenshot showing the Issues section with issue rate, distribution, and top applications

The Issues section shows whether the overall problem level is trending up, how security and quality issues are distributed, and which applications are generating the most flagged interactions.

Cost

Compare application expenses, identify the primary cost drivers, and spot high-spending users for optimization or further investigation.

  • Total Cost — Overall spend, with trend over time.
  • Total Tokens — Cumulative token usage, with trend over time.
  • Top 3 Most Expensive Applications — Ranked by cost, then token usage.
  • Top 3 High-Spending Users — Ranked by cost, then token usage.

Start with Total Cost to confirm the spend direction, then use the top-3 breakdowns to locate the source.
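
"Ranked by cost, then token usage" means token count only breaks ties between equal costs. A minimal sketch of that ordering, with made-up application records (the field names are illustrative, not the product's schema):

```python
# Hypothetical application records; field names are illustrative only.
apps = [
    {"name": "summarizer", "cost_usd": 412.0, "tokens": 12_400_000},
    {"name": "chatbot",    "cost_usd": 412.0, "tokens": 9_100_000},
    {"name": "search",     "cost_usd": 97.5,  "tokens": 2_000_000},
]

# Primary key: cost descending; secondary key: tokens descending (tie-break).
top3 = sorted(apps, key=lambda a: (-a["cost_usd"], -a["tokens"]))[:3]
print([a["name"] for a in top3])
```

The same two-level ordering applies to the Top 3 High-Spending Users list.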

Screenshot showing the Cost section with total cost and top applications by spend

The Cost section shows total spend and token usage over time, alongside the three most expensive applications and highest-spending users, so you can quickly locate where to focus optimization efforts.

Errors

Monitor error trends across all your applications before drilling down.

  • Error Rate — Percentage of errored traces.
  • Errored Traces — Total traces with errors.
  • Total Traces — Total traces processed.
  • Errors Over Time — Error trend over the selected period.
  • Top 3 Errored Spans — Spans with the highest error counts.

Screenshot showing the Errors section with error rate trend and top errored spans

The Errors section tracks error rate trends over the selected period and surfaces the spans with the highest failure counts, giving you a starting point for diagnosing recurring failures.

Latency

Assess delays between user request and LLM output. Percentiles help you separate typical latency from tail latency that affects a smaller but important set of users.

  • Time to Response — LLM response time (average, P75, P90, P95, P99).
  • Time to Response Trends — Latency pattern over time, by percentile.
  • Top 3 Slowest Applications — Applications with the highest time to response.

Start with Time to Response to understand the baseline. Review trends and the slowest applications to identify where latency is increasing.
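
Percentiles matter here because an average hides the tail. The sketch below uses a nearest-rank percentile over made-up latency samples to show how a healthy-looking average can coexist with a painful P90 and P99; the numbers are illustrative, and Coralogix may compute percentiles differently.

```python
# Nearest-rank percentile sketch; latency samples are made up.
def percentile(samples, p):
    s = sorted(samples)
    # nearest-rank: smallest value with at least p% of samples at or below it
    k = max(0, -(-len(s) * p // 100) - 1)  # ceil(n * p / 100) - 1
    return s[k]

latencies_ms = [120, 135, 150, 160, 900, 180, 140, 155, 4200, 130]
avg = sum(latencies_ms) / len(latencies_ms)
print(f"avg={avg:.0f}ms p90={percentile(latencies_ms, 90)}ms "
      f"p99={percentile(latencies_ms, 99)}ms")
```

Here the average is 627 ms, but P90 is 900 ms and P99 is 4200 ms: one in ten requests is far slower than the mean suggests, which is exactly what the percentile breakdown surfaces.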

Screenshot showing the Latency section with response time percentiles and trends

The Latency section breaks down response times by percentile and highlights the three slowest applications, making it easy to separate typical performance from tail latency that affects a subset of users.

Application Catalog

The Application Catalog gives you a side-by-side comparison of every AI application. Sort by any KPI — issue rate, cost, error rate, latency — to identify which application to investigate next.

Counters

  • Time to Response — LLM response time across all applications (average, P75, P95, P99).
  • Estimated Cost — Total estimated cost across all traces.
  • Token Usage — Total tokens consumed.
  • Issue Rate — Percentage of prompts and responses with issues, with trend vs. previous period.

Screenshot showing the Application Catalog counters with aggregated KPIs

The Application Catalog counters aggregate response time, cost, token usage, and issue rate across all your AI applications, giving you a single-line health check before comparing individual apps.

Application grid

Each row represents one application. Columns:

  • Application Name
  • Traces — Total LLM calls.
  • Security Issues — Issues flagged by security evaluations.
  • Quality Issues — Issues flagged by quality evaluations (hallucination, toxicity, compliance, etc.).
  • Cost — Estimated cost in USD.
  • Tokens — Total token consumption.
  • Avg. Duration — Average LLM response time.
  • Models — Models used in the application.

Select any row to open the Application Drilldown for that application.

Screenshot showing the Application Catalog grid with one row per application and KPI columns

The Application Catalog grid lists every AI application in one sortable table, with traces, security issues, quality issues, cost, tokens, latency, and models visible at a glance for easy comparison.

Application Drilldown

The Application Drilldown is where the investigation happens. Once you've identified a specific application in the Catalog, this view gives you the full picture: which spans are erroring, which models are slow, which users are spending the most, and what issues are being triggered.

Summary

The Key Insights panel at the top of the Application Drilldown shows a snapshot of every key metric for the selected application. If the application is connected to the Guardrails SDK, a "This application is guarded" banner appears and Guardrail Actions is included in Key Insights.

  • Models Used — Models used in this application's spans.
  • Time to Response — Span response time (average, P75, P95, P99).
  • Token Usage — Total tokens processed.
  • Estimated Cost — Cost based on token usage and model pricing.
  • Error Rate — Percentage of errored traces, with trend vs. previous period.
  • Issue Rate — Percentage of LLM calls flagged with issues, with trend vs. previous period.
  • Guardrail Actions — Percentage of spans where a guardrail policy triggered. Only shown for guarded applications.
  • Total AI Spans — Total AI spans for this application.
  • Unique Users — Number of unique users who made requests.

The Activity section shows Total AI Spans, Issue Rate, and Guardrail Actions as trend cards alongside an Issues Over Time chart that separates evaluation issues from guardrail issues for this application.

Application Drilldown Key Insights showing models, response time, token usage, cost, error rate, issue rate, guardrail actions, total AI spans, and unique users for a guarded application

The Application Drilldown Key Insights panel shows all performance and quality metrics for the selected application, including guardrail action rates and the "This application is guarded" status indicator when the Guardrails SDK is connected.

Issues

Visualize the security and quality issues affecting this specific application. AI-specific problems — hallucinations, prompt injections, policy violations, toxic outputs — don't appear as errors in traditional observability. This section makes them visible and actionable.

  • Issue Rate — Trend in issue rate over time.
  • Issue Distribution — Security vs. quality breakdown.
  • Top Issues — Most frequently triggered evaluations or guardrail policies.

Start with Issue Rate to see whether problems are increasing. Use Issue Distribution to identify the driver. Where guardrails are active, this section also shows how many issues were caught and handled in real time before reaching users. Drill into AI Explorer to inspect the specific interactions.

Application Drilldown Issues section showing issue rate trend, security vs quality distribution, and top triggered policies

The Issues section for a single application shows the trend in flagged interactions, the split between security and quality problems, and which policies are triggering most frequently.

Errors

  • Error Rate — Percentage of errored traces, with trend vs. previous period.
  • Errored Traces — Total traces with errors.
  • Total Traces — Total traces.
  • Errors Over Time — Error trend over the selected period.
  • Top 3 Errored Spans — Spans with the highest error counts.

Start with Error Rate to understand impact. Use Top 3 Errored Spans to narrow down where failures originate.

Application Drilldown Errors section showing error rate, errored and total trace counts, and errors over time chart

The Errors section shows error rate, total trace volume, and an errors-over-time chart for the selected application, helping you understand whether failures are isolated or systemic.

Latency

Compare this application's latency against the organization average, and understand which models and spans are contributing to delays.

  • Span Latency — This application's average span time vs. organization average.
  • Top 3 Slowest Spans — Spans with the longest average duration.
  • Latency Over Time — Trend by average, P75, P90, P95.
  • Latency by Model — Per-model latency breakdown (average, P75, P95, P99).

Application Drilldown Latency section showing span latency comparison, slowest spans, and latency trends by percentile

The Latency section compares this application's response time against the organization average, identifies the slowest spans, and shows how latency has evolved over the selected period.

Cost

Identify which spans and users are driving spend for this application.

  • Total Cost — Overall spend with trend over time.
  • Total Tokens — Cumulative token usage with trend over time.
  • Top 3 Most Expensive Spans — Spans ranked by cost, then token usage.
  • Top 3 High-Spending Users — Users ranked by cost, then token usage.

Application Drilldown Cost section showing total cost over time, most expensive spans, and high-spending users

The Cost section breaks down spend for the selected application by span and by user, making it straightforward to identify what is driving cost and whether it aligns with expected usage.

Tool calls

Understand how this application uses external tools, services, and systems to generate responses.

  • Number of Tool Calls per Interaction — Total tool calls, split into successful and failed.
  • Tool Distribution — Frequency breakdown by tool.
  • Total Usage in User-to-LLM Requests — How many user-to-LLM interactions triggered tool usage.
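
The three tool-call metrics are straightforward aggregations over the tool-call spans recorded for each interaction. A minimal sketch, with made-up interaction records (the field names and tool names are illustrative, not the product's span schema):

```python
# Sketch of aggregating tool-call spans into the Tool Calls metrics.
# Field names and tool names are illustrative assumptions only.
from collections import Counter

interactions = [
    {"tool_calls": [("web_search", True), ("calculator", True)]},
    {"tool_calls": [("web_search", False)]},
    {"tool_calls": []},  # interaction answered without any tool
]

calls = [c for i in interactions for c in i["tool_calls"]]
succeeded = sum(ok for _, ok in calls)           # successful calls
failed = len(calls) - succeeded                   # failed calls
distribution = Counter(name for name, _ in calls) # frequency by tool
with_tools = sum(1 for i in interactions if i["tool_calls"])

print(f"calls={len(calls)} ok={succeeded} failed={failed}")
print("distribution:", dict(distribution))
print(f"interactions using tools: {with_tools}/{len(interactions)}")
```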

Application Drilldown Tool Calls section showing tool call counts, success vs failure breakdown, and tool distribution

The Tool Calls section shows how many external tools this application invokes per interaction, which tools are used most often, and the ratio of successful to failed calls.