LiteLLM

Coralogix's AI Observability integrations are designed to provide deep insight into applications leveraging large language models. Through a dedicated integration with the LiteLLM SDK, Coralogix delivers a unified view of calls across various LLM providers, enabling teams to track performance, costs, and errors in a single place. This helps teams standardize monitoring, compare model performance, and optimize the entire system for efficiency and accuracy.

Overview

This library offers customized instrumentation for the LiteLLM SDK, optimized to support the development of production-ready applications. It provides streamlined integration and offers detailed visibility into LLM calls through LiteLLM's unified interface. This enables effective debugging, performance analysis, and a clear understanding of all LLM interactions, regardless of the underlying provider (OpenAI, Azure, Anthropic, Cohere, etc.).

Note

Async completion (acompletion) calls cannot be instrumented due to technical limitations in the LiteLLM SDK.

Requirements

  • Python 3.9 or later.
  • A Coralogix API key.

Note

Installation with uv on Windows is not supported due to technical limitations.

Installation

Run the following command:

pip install llm-tracekit[litellm]

Authentication

Authentication details are passed when the instrumentor object is created. You can do this in one of two ways:

Passing arguments to the constructor

You can pass the token, endpoint, and other parameters directly when initializing the LiteLLMInstrumentor.

from llm_tracekit import LiteLLMInstrumentor

instrumentor = LiteLLMInstrumentor(
    coralogix_token="<your_coralogix_token>",
    coralogix_endpoint="<your_coralogix_endpoint>",
    application_name="<ai-application>",
    subsystem_name="<ai-subsystem>"
)

Using environment variables

If arguments are not passed to the constructor, the instrumentor will automatically use the following environment variables:

  • CX_TOKEN: Your Coralogix API key
  • CX_ENDPOINT: The endpoint associated with your Coralogix domain
  • CX_APPLICATION_NAME: Your application's name
  • CX_SUBSYSTEM_NAME: Your subsystem's name
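
For example, a minimal sketch that sets these variables from Python before the instrumentor is created (the values are placeholders; in practice you would usually export the variables in your shell or deployment environment instead):

import os

# Placeholder values; replace with your actual Coralogix details
os.environ["CX_TOKEN"] = "<your_coralogix_token>"
os.environ["CX_ENDPOINT"] = "<your_coralogix_endpoint>"
os.environ["CX_APPLICATION_NAME"] = "<ai-application>"
os.environ["CX_SUBSYSTEM_NAME"] = "<ai-subsystem>"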

Usage

This section describes how to set up instrumentation for the LiteLLM SDK.

Set up tracing

Instrument

To instrument all clients, create an instance of LiteLLMInstrumentor and call the instrument method.

from llm_tracekit import LiteLLMInstrumentor

# Arguments can be passed here or set as environment variables
instrumentor = LiteLLMInstrumentor()
instrumentor.instrument()

Uninstrument

To uninstrument clients, call the uninstrument method.

instrumentor.uninstrument()
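
If you only want tracing active around a specific block of code, one possible pattern (a sketch, not something the SDK requires) is to pair the two calls with try/finally:

from llm_tracekit import LiteLLMInstrumentor

instrumentor = LiteLLMInstrumentor()
instrumentor.instrument()
try:
    # Traced LiteLLM calls go here
    ...
finally:
    # Remove the instrumentation even if an error occurs
    instrumentor.uninstrument()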

Full example

import litellm
from llm_tracekit import LiteLLMInstrumentor

# Activate instrumentation
# Coralogix connection details are read from environment variables:
# - CX_TOKEN
# - CX_ENDPOINT
# - CX_APPLICATION_NAME
# - CX_SUBSYSTEM_NAME
instrumentor = LiteLLMInstrumentor()
instrumentor.instrument()

# LiteLLM SDK Usage Example
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"content": "What is the capital of Italy?", "role": "user"}]
)

print(response)

Enable message content capture

By default, message content, such as prompt and completion text, function arguments, and return values, is not captured.

To capture message content as span attributes, set the environment variable OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT to true.

Most Coralogix AI evaluations require message contents to function properly, so enabling message capture is strongly recommended.
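
For example, to enable capture from Python (setting the variable in your shell or deployment environment works equally well):

import os

# Set before calling instrument() so the instrumentor picks it up
os.environ["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "true"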

Key differences from OpenTelemetry

User prompts and model responses are captured as span attributes instead of log events, as detailed below.

Semantic conventions

| Attribute | Type | Description | Example |
|---|---|---|---|
| gen_ai.operation.name | string | The specific name of the operation being performed | chat |
| gen_ai.system | string | The provider or framework responsible for the operation | openai, anthropic, cohere |
| gen_ai.request.model | string | The name of the model that the user or application requested | gpt-4o-mini |
| gen_ai.request.temperature | float | The 'temperature' parameter passed in the request. It controls the randomness of the output: higher values (e.g., 1.0) make the output more random, while lower values make it more deterministic. | 1.0 |
| gen_ai.request.top_p | float | The 'top_p' parameter used for nucleus sampling. The model considers only the tokens comprising the top 'p' probability mass, serving as an alternative to temperature for controlling randomness. | 1.0 |
| gen_ai.prompt.<message_number>.role | string | Role of message author for user message | system, user, assistant, tool |
| gen_ai.prompt.<message_number>.content | string | Contents of user message | What's the weather in Paris? |
| gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.id | string | ID of tool call in user message | call_yPIxaozNPCSp1tJ34Hsbdtzg |
| gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.type | string | Type of tool call in user message | function |
| gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.name | string | The name of the function used in tool call within user message | get_current_weather |
| gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.arguments | string | Arguments passed to the function used in tool call within user message | {"location": "Paris"} |
| gen_ai.prompt.<message_number>.tool_call_id | string | Tool call ID in user message | call_mszuSIzqtI65i1wAUOE8w5H4 |
| gen_ai.completion.<choice_number>.role | string | Role of message author for choice in model response | assistant |
| gen_ai.completion.<choice_number>.finish_reason | string | Finish reason for choice in model response | stop, tool_calls, error |
| gen_ai.completion.<choice_number>.content | string | Contents of choice in model response | The weather in Paris is rainy and overcast, with temperatures around 57°F |
| gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.id | string | ID of tool call in choice | call_O8NOz8VlxosSASEsOY7LDUcP |
| gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.type | string | Type of tool call in choice | function |
| gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.function.name | string | The name of the function used in tool call within choice | get_current_weather |
| gen_ai.completion.<choice_number>.tool_calls.<tool_call_number>.function.arguments | string | Arguments passed to the function used in tool call within choice | {"location": "Paris"} |
| gen_ai.response.model | string | The exact name of the model that produced the response | gpt-4o-mini-2024-07-18 |
| gen_ai.response.id | string | A unique identifier assigned to the specific completion | chatcmpl-CEaLMZn6bfTEOKFumw5IdMFiZ657a |
| gen_ai.usage.input_tokens | int | The number of tokens consumed by the prompt sent to the model | 66 |
| gen_ai.usage.output_tokens | int | The number of tokens generated in the model response | 44 |
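
For illustration, the litellm.completion call from the full example above might produce a span with attributes along these lines (all values are illustrative, not actual output, and the content attributes appear only when message content capture is enabled):

gen_ai.operation.name = "chat"
gen_ai.system = "openai"
gen_ai.request.model = "gpt-4o-mini"
gen_ai.prompt.0.role = "user"
gen_ai.prompt.0.content = "What is the capital of Italy?"
gen_ai.completion.0.role = "assistant"
gen_ai.completion.0.finish_reason = "stop"
gen_ai.completion.0.content = "The capital of Italy is Rome."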