OpenAI Agents SDK
Coralogix's AI Observability integrations are designed to provide deep insight into complex agentic AI applications. Through a dedicated integration with the OpenAI Agents SDK, Coralogix delivers end-to-end visibility into how your agents interact, collaborate, and utilize tools. This helps teams monitor the flow of tasks across Handoffs, analyze tool performance, and optimize the entire agentic system for efficiency and accuracy.
Overview
This library offers customized instrumentation for the OpenAI Agents SDK, optimized to support the development of production-ready agentic applications. It provides streamlined integration and leverages the SDK's native tracing capabilities to offer detailed visibility into agent behavior, including agent loops, Handoffs between agents, Guardrail validations, and tool function calls. This enables effective debugging, performance analysis, and a clear understanding of your entire agentic workflow.
Requirements
- Python 3.9 or above.
- Coralogix API keys.
Installation
Run the following command (the package name `llm-tracekit` is an assumption here, inferred from the `llm_tracekit` import used in the examples below).
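```bash
pip install llm-tracekit
```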
Authentication
Authentication data is passed during OTel Span Exporter definition:
- Select the endpoint associated with your Coralogix domain.
- Use your customized API key in the authorization request header.
- Provide the application and subsystem names.
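A minimal sketch of this setup using `setup_export_to_coralogix` (the `coralogix_token` and `coralogix_endpoint` parameter names are assumptions; both values can also be supplied through the environment variables noted below):

```python
from llm_tracekit import setup_export_to_coralogix

# Assumed parameter names; the token and endpoint may also be read from
# the CX_TOKEN and CX_ENDPOINT environment variables.
setup_export_to_coralogix(
    coralogix_token="<your_coralogix_api_key>",
    coralogix_endpoint="<your_coralogix_endpoint>",
    service_name="ai-service",
    application_name="ai-application",
    subsystem_name="ai-subsystem",
    capture_content=True,
)
```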
Note
All of the authentication parameters can also be provided through environment variables (`CX_TOKEN`, `CX_ENDPOINT`, etc.).
Usage
This section describes how to set up instrumentation for the OpenAI Agents SDK.
Set up tracing
Automatic
Use the `setup_export_to_coralogix` function to set up tracing and export traces to Coralogix. See the code snippet in the Authentication section above.
Manual
Alternatively, you can set up tracing manually.
```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

tracer_provider = TracerProvider(
    resource=Resource.create({SERVICE_NAME: "ai-service"}),
)

# The exporter reads the collector endpoint and authorization headers from the
# standard OTEL_EXPORTER_OTLP_* environment variables; they can also be passed
# explicitly, e.g. OTLPSpanExporter(endpoint=..., headers=...).
exporter = OTLPSpanExporter()
span_processor = SimpleSpanProcessor(exporter)
tracer_provider.add_span_processor(span_processor)
trace.set_tracer_provider(tracer_provider)
```
Instrument
To instrument all clients, call the `instrument` method.
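For example:

```python
from llm_tracekit import OpenAIInstrumentor

OpenAIInstrumentor().instrument()
```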
Uninstrument
To uninstrument clients, call the `uninstrument` method.
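For example:

```python
from llm_tracekit import OpenAIInstrumentor

OpenAIInstrumentor().uninstrument()
```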
Full example
```python
import asyncio

from agents import Agent, Runner
from llm_tracekit import OpenAIInstrumentor, setup_export_to_coralogix

# Optional: configure sending spans to Coralogix.
# Reads Coralogix connection details from the following environment variables:
# - CX_TOKEN
# - CX_ENDPOINT
setup_export_to_coralogix(
    service_name="ai-service",
    application_name="ai-application",
    subsystem_name="ai-subsystem",
    capture_content=True,
)


async def main():
    # Activate instrumentation
    OpenAIInstrumentor().instrument()

    # OpenAI Agents SDK usage example
    math_tutor_agent = Agent(
        name="Math Tutor",
        model="gpt-4o-mini",
        instructions="You provide help with math problems.",
    )

    prompt = "A circle has a radius of 10 cm. What is its area?"
    result = await Runner.run(math_tutor_agent, prompt)
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```
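Running this example produces a trace containing an Agent span for the Math Tutor agent with an enriched LLM call span beneath it, as described in the Semantic conventions section below.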
Enable message content capture
By default, message content, such as the contents of prompts, completions, function arguments, and return values, is not captured. To capture message content as span attributes, do either of the following:
- Pass `capture_content=True` when calling `setup_export_to_coralogix`.
- Set the environment variable `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` to `true`.
Most Coralogix AI evaluations require message contents to function properly, so enabling message capture is strongly recommended.
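For example, when content capture cannot be enabled in code, the environment variable can be set before starting the application:

```bash
export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
```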
Key differences from OpenTelemetry
This instrumentation goes beyond standard LLM tracing by introducing custom spans that map directly to the core primitives of the OpenAI Agents SDK. This provides a structured, hierarchical view of the entire agentic workflow.
- Agent, Guardrail, Handoff, and Function spans: Instead of a flat list of LLM calls, the trace is organized into distinct span types: Agent (the orchestrator), Guardrail (validation steps), Handoff (delegation between agents), and Function (tool execution).
- Agent-specific context on LLM spans: Standard LLM call spans are enriched with new attributes such as `gen_ai.agent.name` and `gen_ai.agent.instruction`.
- Structured handoff and tool data: Information about which agents are available for handoff and which tools are used is captured directly in the Agent span attributes, providing a clear overview of an agent's capabilities at a glance.
- Error propagation: The status of a span is used to clearly indicate failures in the agentic flow.
Semantic conventions
This integration adds the following span types and attributes to describe the agentic workflow.
Agent spans
These spans represent the execution of a single agent. They act as parents for LLM calls, guardrails, and handoffs initiated by that agent.

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `type` | string | The type of the span, identifying it as an agent execution. | `agent` |
| `agent_name` | string | The name of the agent being executed. | `Assistant` |
| `handoffs` | string[] | A list of other agents that this agent is capable of handing off to. | `["WeatherAgent"]` |
| `tools` | string[] | A list of tools (functions) available to this agent. | `["get_current_weather"]` |
| `output_type` | string | The expected data type of the agent's final output. | `MessageOutput` |
Guardrail spans
These spans represent the execution of a guardrail check.

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `type` | string | The type of the span, identifying it as a guardrail. | `guardrail` |
| `name` | string | The unique name of the guardrail being executed. | `MathGuardrail` |
| `triggered` | boolean | Indicates whether the guardrail condition was met (and triggered). | `false` |
Handoff spans
These spans represent the moment an agent attempts to delegate a task to another agent.

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `type` | string | The type of the span, identifying it as a handoff. | `handoff` |
| `from_agent` | string | The name of the agent initiating the handoff. | `Assistant` |
| `to_agent` | string | The name of the agent intended to receive the handoff. | `WeatherAgent` |
Function spans
These spans represent the execution of a tool (a Python function).

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `type` | string | The type of the span, identifying it as a function. | `function` |
| `name` | string | The name of the function that was called. | `get_current_weather` |
| `input` | string | The JSON string of arguments passed to the function. | `{"city":"Tel Aviv"}` |
| `output` | string | The string representation of the function's return value. | `The weather in Tel Aviv is 30°C and sunny.` |
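As an illustration, a setup like the following sketch (using the Agents SDK's `Agent`, `function_tool`, and `Runner` APIs; the agent and tool names are hypothetical, chosen to match the example values above) would yield an Agent span with `handoffs=["WeatherAgent"]` and `tools=["get_current_weather"]`, a Handoff span from `Assistant` to `WeatherAgent`, and a Function span for the tool call:

```python
from agents import Agent, Runner, function_tool


@function_tool
def get_current_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    return f"The weather in {city} is 30°C and sunny."


# Hypothetical agents matching the example attribute values above.
weather_agent = Agent(
    name="WeatherAgent",
    model="gpt-4o-mini",
    instructions="Answer weather questions using the weather tool.",
    tools=[get_current_weather],
)

assistant = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    instructions="Hand off any weather question to WeatherAgent.",
    handoffs=[weather_agent],
)

result = Runner.run_sync(assistant, "What's the weather in Tel Aviv?")
print(result.final_output)
```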
Enriched LLM call spans
These attributes are added to the existing `ResponseSpanData` to link LLM calls back to the responsible agent.

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `gen_ai.agent.name` | string | The name of the agent that initiated this LLM call. | `Assistant`, `WeatherAgent` |
Agents SDK-specific attributes
| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `gen_ai.prompt.<message_number>.role` | string | Role of the author of a message in the prompt. | `user`, `assistant`, `tool` |
| `gen_ai.prompt.<message_number>.content` | string | Contents of a message in the prompt. | `What's the weather in Tel Aviv?` |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.id` | string | ID of a tool call in a prompt message. | `call_O8NOz8VlxosSASEsOY7LDUcP` |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.name` | string | Name of the function used in a tool call within a prompt message. | `get_current_weather` |
| `gen_ai.prompt.<message_number>.tool_calls.<tool_call_number>.function.arguments` | string | Arguments passed to the function used in a tool call within a prompt message. | `{"city": "Tel Aviv"}` |
| `gen_ai.prompt.<message_number>.tool_call_id` | string | Tool call ID in a prompt message. | `call_mszuSIzqtI65i1wAUOE8w5H4` |
| `gen_ai.completion.<choice_number>.role` | string | Role of the message author for a choice in the model response. | `assistant` |
| `gen_ai.completion.<choice_number>.finish_reason` | string | Finish reason for a choice in the model response. | `completed`, `error` |
| `gen_ai.completion.<choice_number>.content` | string | Contents of a choice in the model response. | `The weather in Tel Aviv is 30°C and sunny.` |
In addition, the following request and response attributes are captured on LLM call spans.

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| `gen_ai.request.model` | string | The model identifier requested to perform the operation. | `gpt-4o-mini-2024-07-18` |
| `gen_ai.request.temperature` | float | The `temperature` parameter passed in the request. It controls the randomness of the output: higher values (e.g., 1.0) make the output more random, while lower values make it more deterministic. | `1.0` |
| `gen_ai.request.top_p` | float | The `top_p` parameter used for nucleus sampling. The model considers only the tokens comprising the top `p` probability mass, serving as an alternative to temperature for controlling randomness. | `1.0` |
| `gen_ai.response.model` | string | The identifier of the model that actually generated the response. This may differ from `gen_ai.request.model` in cases of provider-side model updates or aliasing. | `gpt-4o-2024-08-06` |
| `gen_ai.response.id` | string | The unique identifier for the response generated by the API. This is useful for correlating with provider-side logs or for debugging purposes. | `resp_6880c9985a98819f99976590f0717f760621b19adbeecfb2` |
| `gen_ai.usage.input_tokens` | int | The number of tokens in the prompt or input sent to the model. | `243` |
| `gen_ai.usage.output_tokens` | int | The number of tokens generated by the model in the completion. | `25` |