LiteLLM
Coralogix's AI Observability integrations are designed to provide deep insight into applications leveraging large language models. Through a dedicated integration with the LiteLLM SDK, Coralogix delivers a unified view of calls across various LLM providers, enabling teams to track performance, costs, and errors in a single place. This helps teams standardize monitoring, compare model performance, and optimize the entire system for efficiency and accuracy.
Overview
This library offers customized instrumentation for the LiteLLM SDK, optimized to support the development of production-ready applications. It provides streamlined integration and offers detailed visibility into LLM calls through LiteLLM's unified interface. This enables effective debugging, performance analysis, and a clear understanding of all LLM interactions, regardless of the underlying provider (OpenAI, Azure, Anthropic, Cohere, etc.).
Note
Instrumentation of async completion (`acompletion`) calls is not possible due to technical issues within the LiteLLM SDK.
Requirements
- Python 3.9 or later.
- Coralogix API keys.
Note
Installation using `uv` on Windows is not supported due to technical reasons.
Installation
Run the following command (the package is assumed here to be published on PyPI as llm-tracekit, matching the `llm_tracekit` import path used below):
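```bash
# Package name assumed from the llm_tracekit import path
pip install llm-tracekit
```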
Authentication
Authentication details are provided when the instrumentor object is created. You can do this in one of two ways:
Passing arguments to the constructor
You can pass the token, endpoint, and other parameters directly when initializing the `LiteLLMInstrumentor`:
```python
from llm_tracekit import LiteLLMInstrumentor

instrumentor = LiteLLMInstrumentor(
    coralogix_token="<your_coralogix_token>",
    coralogix_endpoint="<your_coralogix_endpoint>",
    application_name="<ai-application>",
    subsystem_name="<ai-subsystem>",
)
```
Using environment variables
If arguments are not passed to the constructor, the instrumentor will automatically use the following environment variables:
- `CX_TOKEN`: Your Coralogix API key
- `CX_ENDPOINT`: The endpoint associated with your Coralogix domain
- `CX_APPLICATION_NAME`: Your application's name
- `CX_SUBSYSTEM_NAME`: Your subsystem's name
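For example, these can be set in the shell before starting the application (placeholder values shown):

```bash
export CX_TOKEN="<your_coralogix_token>"
export CX_ENDPOINT="<your_coralogix_endpoint>"
export CX_APPLICATION_NAME="<ai-application>"
export CX_SUBSYSTEM_NAME="<ai-subsystem>"
```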
Usage
This section describes how to set up instrumentation for the LiteLLM SDK.
Set up tracing
Instrument
To instrument all clients, create an instance of `LiteLLMInstrumentor` and call its `instrument` method:
```python
from llm_tracekit import LiteLLMInstrumentor

# Arguments can be passed here or set as environment variables
instrumentor = LiteLLMInstrumentor()
instrumentor.instrument()
```
Uninstrument
To uninstrument clients, call the `uninstrument` method:
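For example, using the instrumentor created above:

```python
instrumentor.uninstrument()
```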
Full example
```python
import litellm
from llm_tracekit import LiteLLMInstrumentor

# Activate instrumentation
# Coralogix connection details are read from environment variables:
# - CX_TOKEN
# - CX_ENDPOINT
# - CX_APPLICATION_NAME
# - CX_SUBSYSTEM_NAME
instrumentor = LiteLLMInstrumentor()
instrumentor.instrument()

# LiteLLM SDK usage example
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"content": "What is the capital of Italy?", "role": "user"}],
)
print(response)
```
Enable message content capture
By default, message content (such as prompt and completion contents, function arguments, and return values) is not captured.
To capture message content as span attributes, set the environment variable `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` to `true`.
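For example, in the shell:

```bash
export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
```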
Most Coralogix AI evaluations require message contents to function properly, so enabling message capture is strongly recommended.
Key differences from OpenTelemetry
User prompts and model responses are captured as span attributes instead of log events, as detailed below.
Semantic conventions
| Attribute | Type | Description | Example |
|---|---|---|---|
| `gen_ai.operation.name` | string | The specific name of the operation being performed | chat |
| `gen_ai.system` | string | The provider or framework responsible for the operation | openai, anthropic, cohere |
| `gen_ai.request.model` | string | The name of the model requested by the user or application | gpt-4o-mini |
| `gen_ai.request.temperature` | float | The `temperature` parameter passed in the request; controls the randomness of the output: higher values (e.g., 1.0) make the output more random, while lower values make it more deterministic | 1.0 |
| `gen_ai.request.top_p` | float | The `top_p` parameter used for nucleus sampling; the model considers only the tokens comprising the top `p` probability mass, serving as an alternative to temperature for controlling randomness | 1.0 |
| `gen_ai.prompt.<message_number>.role` | string | Role of the message author for a prompt message | system, user, assistant, tool |
| `gen_ai.prompt.<message_number>.content` | string | Contents of a prompt message | What's the weather in Paris? |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.id` | string | ID of a tool call in a prompt message | call_yPIxaozNPCSp1tJ34Hsbdtzg |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.type` | string | Type of a tool call in a prompt message | function |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.name` | string | Name of the function used in a tool call within a prompt message | get_current_weather |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.arguments` | string | Arguments passed to the function used in a tool call within a prompt message | {"location": "Paris"} |
| `gen_ai.prompt.<message_number>.tool_call_id` | string | Tool call ID in a prompt message | call_mszuSIzqtI65i1wAUOE8w5H4 |
| `gen_ai.completion.<choice_number>.role` | string | Role of the message author for a choice in the model response | assistant |
| `gen_ai.completion.<choice_number>.finish_reason` | string | Finish reason for a choice in the model response | stop, tool_calls, error |
| `gen_ai.completion.<choice_number>.content` | string | Contents of a choice in the model response | The weather in Paris is rainy and overcast, with temperatures around 57°F |
| `gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.id` | string | ID of a tool call in a choice | call_O8NOz8VlxosSASEsOY7LDUcP |
| `gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.type` | string | Type of a tool call in a choice | function |
| `gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.function.name` | string | Name of the function used in a tool call within a choice | get_current_weather |
| `gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.function.arguments` | string | Arguments passed to the function used in a tool call within a choice | {"location": "Paris"} |
| `gen_ai.response.model` | string | The exact name of the model that produced the response | gpt-4o-mini-2024-07-18 |
| `gen_ai.response.id` | string | A unique identifier assigned to the specific completion | chatcmpl-CEaLMZn6bfTEOKFumw5IdMFiZ657a |
| `gen_ai.usage.input_tokens` | int | The number of tokens consumed by the prompt sent to the model | 66 |
| `gen_ai.usage.output_tokens` | int | The number of tokens generated in the model response | 44 |
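As an illustration of the indexed attribute scheme (a hypothetical sketch, not output captured from the instrumentation, and assuming zero-based numbering for messages, choices, and tool calls), a simple chat request would map onto these attributes roughly as follows:

```python
# Hypothetical mapping of a chat request onto the indexed span attributes;
# exact indices and values depend on the actual request and response.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
]
# Expected prompt attributes (assuming zero-based message numbering):
#   gen_ai.prompt.0.role    = "system"
#   gen_ai.prompt.0.content = "You are a helpful assistant."
#   gen_ai.prompt.1.role    = "user"
#   gen_ai.prompt.1.content = "What's the weather in Paris?"
# If the model answered with a tool call, the first choice would produce, e.g.:
#   gen_ai.completion.0.role = "assistant"
#   gen_ai.completion.0.finish_reason = "tool_calls"
#   gen_ai.completion.0.tool_calls.0.function.name = "get_current_weather"
#   gen_ai.completion.0.tool_calls.0.function.arguments = '{"location": "Paris"}'
```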