Evaluate the Quality of Your LLMs
Use evaluators to measure the quality of your LLMs against policies such as toxicity or sensitive data exposure. Apply them to surface quality and security issues during development.
How evals and policies work together
Policies are configurable evaluation and guardrail mechanisms that help you detect and control security, safety, and quality risks in LLM-based applications.
In the Policy Catalog, choose the policies that match the behaviors you want to monitor for each agent, then configure the evaluators that will score those policies. Once configured, evals are applied automatically to all spans streamed into the platform, without adding latency to the ingestion process or interfering with live requests. The system identifies issues and assigns a score to each result.
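To make the scoring step concrete, here is a minimal sketch of what an evaluator conceptually does: it receives an ingested span and returns a per-policy score. All names here (`Span`, `EvalResult`, `evaluate_span`) are hypothetical illustrations, not Coralogix APIs.

```python
# Hypothetical sketch of scoring a span against a policy.
# These names illustrate the concept only; they are not Coralogix APIs.
from dataclasses import dataclass

@dataclass
class Span:
    user_prompt: str
    llm_response: str

@dataclass
class EvalResult:
    policy: str
    score: float  # higher = stronger signal of a policy violation

def evaluate_span(span: Span, banned_terms: list[str]) -> EvalResult:
    """Toy toxicity check: fraction of banned terms found in the response."""
    text = span.llm_response.lower()
    hits = sum(term.lower() in text for term in banned_terms)
    score = hits / len(banned_terms) if banned_terms else 0.0
    return EvalResult(policy="toxicity", score=score)

# Scoring runs on spans after ingestion, so it adds no latency to live requests.
result = evaluate_span(
    Span(user_prompt="Tell me a joke", llm_response="Here is a clean one..."),
    banned_terms=["slur", "insult"],
)
print(result)  # EvalResult(policy='toxicity', score=0.0)
```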
Coralogix displays high scores as issues in the relevant Application dashboards. High and low scores also appear as labels for each LLM call on the LLM Calls page.
Prebuilt policies
Coralogix provides prebuilt policies that you can apply to any AI application. Each evaluation includes configuration options, such as where to apply it (user prompt, LLM response, or both) and which categories to include or restrict.
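For intuition, a prebuilt policy's configuration options could be modeled as in the sketch below. The field names and enum values are hypothetical and do not reflect the actual Coralogix schema; they only mirror the options described above.

```python
# Hypothetical model of a prebuilt policy configuration; field names are
# illustrative and do not reflect the actual Coralogix schema.
from dataclasses import dataclass, field
from enum import Enum

class ApplyTo(Enum):
    USER_PROMPT = "user_prompt"
    LLM_RESPONSE = "llm_response"
    BOTH = "both"

@dataclass
class PolicyConfig:
    name: str
    apply_to: ApplyTo                                     # where the evaluator runs
    categories: list[str] = field(default_factory=list)  # categories to include

toxicity = PolicyConfig(
    name="toxicity",
    apply_to=ApplyTo.BOTH,               # score both the prompt and the response
    categories=["hate", "harassment"],
)
print(toxicity)
```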
For a full list of prebuilt policies and instructions on creating and managing evals for them, see Prebuilt Policies.
Custom policies
Alongside the prebuilt policies, you can create your own policies based on user-defined criteria and use cases.
For instructions on creating and managing evals for custom policies, see Custom Policies.
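Conceptually, a custom policy pairs a user-defined criterion with an evaluator that scores it. The following is a minimal hypothetical sketch; the function and its check are invented for illustration and are not part of the Coralogix product.

```python
# Hypothetical custom policy: flag responses that mention competitor names.
# The function and names are illustrative, not a Coralogix API.
def competitor_mention_eval(llm_response: str, competitors: list[str]) -> float:
    """Return 1.0 if any competitor name appears in the response, else 0.0."""
    text = llm_response.lower()
    return 1.0 if any(name.lower() in text for name in competitors) else 0.0

score = competitor_mention_eval(
    "Our product compares favorably to AcmeAI.",
    competitors=["AcmeAI", "ExampleCorp"],
)
print(score)  # 1.0 -> surfaced as an issue and labeled on the LLM call
```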
Billing
Customers are billed based on the number of active evaluations and the volume of LLM interactions, measured in tokens.
Find out more in Billing & Usage.
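As a rough illustration of the two billing dimensions, the sketch below uses placeholder numbers and an assumed combination formula; actual rates and the exact formula are defined in Billing & Usage, not here.

```python
# Placeholder numbers only; actual rates and the exact formula are defined
# in Billing & Usage, not here.
active_evaluations = 3           # evaluations currently enabled
monthly_llm_tokens = 25_000_000  # token volume of LLM interactions

# One plausible reading (an assumption): each active evaluation processes
# the full token volume, so evaluated volume scales with both dimensions.
evaluated_tokens = active_evaluations * monthly_llm_tokens
print(f"{evaluated_tokens:,} evaluated tokens per month")  # 75,000,000
```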