AI Center Overview
The AI Center Overview dashboard provides a centralized, data-driven observability hub for all your AI applications. Through the Overview dashboard, you gain a holistic view of AI app health, performance, cost, quality issues, and security posture. This improves decision-making and accelerates issue resolution, leading to better performance and greater reliability across all your AI applications.
Accessing the Overview dashboard
- In the Coralogix UI, navigate to AI Center > Overview.
- Use the time picker to select the time range for the displayed metrics.
- Scroll to the relevant Overview section.
Summary
The Summary section aggregates key information about the performance of all your AI apps, displaying counters computed from span data. A bar chart also offers at-a-glance daily insight into your prompt and response data.
Counters
The counters offer a comprehensive snapshot of key performance, usage, and cost metrics, illustrated in the sketch after this list:
- Models Used – A list of models used in the applications.
- Error Rate – Percentage of traces with errors across all applications.
- Time to Response – The time taken by the LLM to respond to the user. Filter by average, or by the P90, P95, or P99 percentile.
- Estimated Cost – Cost calculation based on token usage and model costs.
- Token Usage – The total token usage across all applications.
- Issue Rate – The percentage of prompts and responses with issues out of the total.
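To make these definitions concrete, here is a minimal sketch of how such counters could be derived from span data. The `Span` record, its field names, and the per-token prices are hypothetical assumptions for illustration, not the actual Coralogix schema or pricing; percentile filters are sketched separately under Latency.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical span record; field names are illustrative, not the real schema.
@dataclass
class Span:
    model: str
    has_error: bool
    response_time_s: float
    tokens: int
    has_issue: bool

spans = [
    Span("gpt-4o", False, 1.2, 800, False),
    Span("gpt-4o", True, 3.4, 1200, True),
    Span("claude-3", False, 0.9, 650, False),
]

# Assumed per-1K-token prices for illustration; real model pricing varies.
PRICE_PER_1K = {"gpt-4o": 0.005, "claude-3": 0.003}

models_used = sorted({s.model for s in spans})
error_rate = 100 * sum(s.has_error for s in spans) / len(spans)
avg_time_to_response = mean(s.response_time_s for s in spans)
estimated_cost = sum(s.tokens / 1000 * PRICE_PER_1K[s.model] for s in spans)
token_usage = sum(s.tokens for s in spans)
issue_rate = 100 * sum(s.has_issue for s in spans) / len(spans)

print(models_used, f"{error_rate:.1f}%", f"{avg_time_to_response:.2f}s",
      f"${estimated_cost:.4f}", token_usage, f"{issue_rate:.1f}%")
```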
LLM calls
Analyze patterns and trends in LLM call (prompt and response) issues to identify recurring challenges and potential areas for improvement over time. You can hover over the graph to view the data distribution for that specific time point.
- Total LLM Calls – The total number of LLM calls (prompts and responses) across all applications.
- Prompt Issues – The total number of prompts that contain issues. A prompt containing three different issues, for example, is still counted as a single prompt with issues (see the sketch after this list).
- Response Issues – The total number of responses that contain issues.
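The counting rule above (one issued prompt per prompt, however many issues it contains) amounts to deduplicating by call ID. A minimal sketch, assuming a hypothetical list of evaluation results:

```python
# Hypothetical evaluation results: (call_id, role, issue_type); None = clean.
evaluations = [
    ("call-1", "prompt", "pii_leak"),
    ("call-1", "prompt", "toxicity"),        # same prompt, second issue
    ("call-2", "response", "hallucination"),
    ("call-3", "prompt", None),              # no issue found
]

# Deduplicate by call ID so a multi-issue prompt counts once.
prompt_issues = {cid for cid, role, issue in evaluations
                 if role == "prompt" and issue is not None}
response_issues = {cid for cid, role, issue in evaluations
                   if role == "response" and issue is not None}

print(len(prompt_issues))    # 1 — call-1 counts once despite two issues
print(len(response_issues))  # 1
```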
Issues
Detect and mitigate potential risks by investigating overall message security and quality. An issue is a prompt/response pair whose evaluation score exceeds the predefined threshold, indicating a problem; pairs scoring below the threshold are not counted as issues. The metrics below are illustrated in a short sketch after the list.
- Issue Rate – The percentage of prompts and responses with issues compared to the total number of user and app messages.
- Issue Distribution – The split between Security issues (reported by evaluations in the Security category) and Quality issues (reported by all other evaluations, such as hallucination, toxicity, and compliance), each expressed as a percentage of the total issues.
- Top 3 Applications with Issues – The 3 applications with the highest issue-to-total message ratio, along with the distribution of clean messages versus issues.
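A minimal sketch of these three metrics, assuming hypothetical issue records (each tagged Security or Quality) and made-up per-app message totals:

```python
from collections import Counter

# Hypothetical issue records and message totals for one time range.
issues = [
    {"app": "chatbot", "category": "Security"},
    {"app": "chatbot", "category": "Quality"},     # e.g. hallucination
    {"app": "summarizer", "category": "Quality"},  # e.g. compliance
]
messages_per_app = {"chatbot": 30, "summarizer": 20}
total_messages = sum(messages_per_app.values())

issue_rate = 100 * len(issues) / total_messages
security_pct = 100 * sum(i["category"] == "Security" for i in issues) / len(issues)
quality_pct = 100 - security_pct

# Top 3 applications by issue-to-total-message ratio.
issues_per_app = Counter(i["app"] for i in issues)
top3 = sorted(messages_per_app,
              key=lambda app: issues_per_app[app] / messages_per_app[app],
              reverse=True)[:3]

print(f"{issue_rate:.1f}%", f"{security_pct:.1f}%", f"{quality_pct:.1f}%", top3)
```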
Cost
Gain a clear overview of token usage and associated costs. This section helps you compare app expenses, identify high-cost apps, and highlight users with significant spending, allowing for optimization or further investigation. The ranking logic is sketched after the list:
- Total Cost – The overall expenditure, accompanied by a graph illustrating cost trends over time.
- Total Tokens – The cumulative token usage, with a corresponding graph tracking token consumption over time.
- Top 3 Most Expensive Applications – The 3 apps with the highest costs, ranked primarily by cost and secondarily by token usage.
- Top 3 High-Spending Users – The top 3 users based on spending, ranked by cost first, followed by token usage.
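The ranking rule for both top-3 lists (cost first, token usage as a tiebreaker) can be expressed as a two-key sort. A sketch with made-up per-app aggregates:

```python
# Hypothetical per-app aggregates: (app, cost_usd, tokens).
apps = [
    ("rag-search", 12.40, 2_100_000),
    ("chatbot",    12.40, 1_800_000),  # same cost, fewer tokens
    ("summarizer",  3.10,   900_000),
]

# Rank primarily by cost, secondarily by token usage, both descending.
top3_expensive = sorted(apps, key=lambda a: (a[1], a[2]), reverse=True)[:3]
# The same two-key sort applies to users, e.g. (user, cost, tokens) records.

print([name for name, _, _ in top3_expensive])
```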
Errors
Monitor error traces across all your apps, highlighting error counts and trends over time.
The Errors section uses counters and time series graphs to organize AI app errors.
- Error Rate – The percentage of errored traces out of the total number of traces (see the sketch after this list).
- Errored Traces – The total number of traces with errors.
- Total Traces – The total number of traces.
- Errors Over Time – The error trend over time.
- Top 3 Errored Spans – The 3 spans that have the highest number of errors.
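For clarity, the error counters reduce to a simple ratio over traces, and the top-3 list is a frequency count over errored span names. A sketch with hypothetical data:

```python
from collections import Counter

# Hypothetical traces: (trace_id, has_error) pairs.
traces = [("t1", False), ("t2", True), ("t3", False), ("t4", True)]

errored_traces = sum(err for _, err in traces)
total_traces = len(traces)
error_rate = 100 * errored_traces / total_traces  # 50.0%

# Names of spans that errored; repeated names mean repeated failures.
errored_span_names = ["llm.completion", "vector.lookup", "llm.completion"]
top3_errored_spans = Counter(errored_span_names).most_common(3)

print(f"{error_rate:.1f}%", errored_traces, total_traces, top3_errored_spans)
```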
Latency
Investigate latency data to assess the delays between a user's request initiation and the LLM's output.
- Time to Response – The time taken by the LLM to respond to the user. Select the average, or the P75, P90, P95, or P99 percentile.
- Time to Response Trends – How the LLM's time to response changes over the selected period, segmented by average, P75, P95, and P99 percentiles.
- Top 3 Slowest Applications – The 3 applications with the highest time-to-response values, ranked by average, P75, P95, and P99 (the percentile computation is sketched below).
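The percentile options follow the standard definition. A minimal sketch using Python's statistics module on hypothetical response times:

```python
from statistics import mean, quantiles

# Hypothetical per-request times to response, in seconds.
times = [0.8, 1.1, 1.4, 2.0, 2.3, 3.1, 4.8, 6.5]

avg = mean(times)
# 99 percentile cut points; "inclusive" keeps values inside the data range.
cuts = quantiles(times, n=100, method="inclusive")
p75, p95, p99 = cuts[74], cuts[94], cuts[98]

print(f"avg={avg:.2f}s  p75={p75:.2f}s  p95={p95:.2f}s  p99={p99:.2f}s")
```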