Evaluations

Use evaluators to assess the quality of your LLM applications against policies such as toxicity or sensitive data exposure. Apply them to measure quality and security issues during development and in production.

Through the Policy Catalog, you can select and configure policies to monitor specific behaviors and issues for each application.

How evals and policies work together

Policies are configurable evaluation and guardrail mechanisms that help you detect and control security, safety, and quality risks in LLM-based applications.

In the Policy Catalog, choose the policies that match the behaviors you want to monitor for each application, then configure the evaluators that will score those policies. Once configured, evals are applied automatically to all spans streamed into the platform, without adding latency to the ingestion process or interfering with live requests.
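To make "spans streamed into the platform" concrete, here is a minimal sketch of recording an LLM interaction as an OpenTelemetry span. This is illustrative only: the service name, attribute keys, and the `call_model` placeholder are assumptions, not the Coralogix ingestion contract, and the sketch prints spans to the console rather than exporting them to a backend. The point it demonstrates is that the span is emitted by your application as usual; evaluators score it after ingestion, so the live request path is never blocked.

```python
# Minimal sketch: recording an LLM interaction as an OpenTelemetry span.
# Attribute keys and names below are illustrative assumptions.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Export spans to the console for this sketch; a real setup would use
# an OTLP exporter pointed at your observability backend.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-llm-app")  # hypothetical service name

def call_model(prompt: str) -> str:
    # Placeholder for your actual model client call.
    return "example completion"

def chat(prompt: str) -> str:
    # The span carries the prompt and completion; evaluation happens
    # asynchronously after ingestion, not inside this request.
    with tracer.start_as_current_span("llm.chat") as span:
        span.set_attribute("gen_ai.prompt", prompt)        # assumed attribute key
        response = call_model(prompt)
        span.set_attribute("gen_ai.completion", response)  # assumed attribute key
        return response

if __name__ == "__main__":
    print(chat("Summarize our refund policy."))
```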

When an evaluator returns a high score, Coralogix surfaces the interaction as an issue in the relevant Application dashboards. High and low scores also appear as labels on each AI span in AI Explorer.

Prebuilt policies

Apply ready-to-use evaluations for security, hallucination detection, toxicity, topics, user experience, and compliance. See Prebuilt Policies for the full list and setup instructions.

Custom policies

Alongside prebuilt policies, you can create your own policies based on user-defined criteria and use cases. Custom evaluation criteria and prompt templates let you measure what actually matters for your application, such as regulatory compliance, tone consistency, or task completion accuracy. See Custom Policies.
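For illustration, here is a hypothetical LLM-as-judge prompt template of the kind a custom policy might use. The template name, placeholder variables, and 0-10 scale are assumptions for this sketch, not a Coralogix schema.

```python
# Hypothetical custom-policy prompt template (illustrative only; the
# variable names and 0-10 scale are assumptions, not a Coralogix schema).
TONE_CONSISTENCY_TEMPLATE = """\
You are an evaluator. Rate how consistently the assistant's reply
matches a formal, professional tone.

User prompt: {prompt}
Assistant reply: {response}

Return a single integer from 0 (fully consistent) to 10 (severe
tone violations), followed by a one-sentence justification.
"""

def render(prompt: str, response: str) -> str:
    # Fill the template for a single LLM interaction before sending
    # it to the judge model.
    return TONE_CONSISTENCY_TEMPLATE.format(prompt=prompt, response=response)

if __name__ == "__main__":
    print(render("Where is my order?", "Yo, chill, it's on the way lol"))
```

A template like this pairs a fixed rubric with per-interaction variables, so every interaction is scored against the same criteria.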

Billing

Customers are billed based on the number of active evaluations and the volume of LLM interactions (tokens). See Pricing for details.