Understanding Traces and Spans
Tracing helps you understand how requests move through your system and why they behave the way they do. It is the starting point for investigating latency, errors, and unexpected behavior across distributed services.
This page introduces the core concepts behind tracing and explains why traces and spans matter.
Why tracing matters
Modern systems are distributed. A single user action, such as loading a page, submitting a form, or completing a checkout, can trigger work across many services, databases, queues, and third‑party APIs. When something goes wrong, metrics can tell you that there is a problem, and logs can show individual events, but neither explains how the full request behaved end to end.
Distributed tracing closes this gap by showing:
- What happened across services
- The order in which operations occurred
- How long each step took
- Where errors, bottlenecks, or unexpected behavior were introduced
The Explore tracing screen turns tracing data into an investigation workflow, helping you move from symptoms to root cause.
Core concepts
What is a trace?
A trace represents the complete lifecycle of a single request or transaction as it travels through your system.
A trace answers questions such as:
- Why was this request slow?
- Why did this request fail?
- Which services were involved in handling it?
Each trace captures the full story of one execution path, from the first incoming request to the final response.
What is a span?
A span represents a single operation within a trace.
Examples of spans include:
- An incoming HTTP request
- A call to another service
- A database query
- A cache lookup
Each span contains timing information and contextual data, such as service name, operation, attributes, tags, and error details.
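As a rough illustration of the data a span carries, the sketch below models one span as a Python dataclass. The field names (`trace_id`, `span_id`, `parent_id`, `start_ms`, `end_ms`, and so on) and the sample values are assumptions chosen for this example, not the schema of any particular tracing backend.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical span record; field names are illustrative only."""
    trace_id: str              # identifies the trace this span belongs to
    span_id: str               # identifies this operation
    parent_id: Optional[str]   # None when the span has no parent
    service: str
    operation: str
    start_ms: float            # start time, in ms relative to the trace start
    end_ms: float              # end time, in ms relative to the trace start
    attributes: dict = field(default_factory=dict)
    error: Optional[str] = None

    @property
    def duration_ms(self) -> float:
        # Timing information: how long this single operation took.
        return self.end_ms - self.start_ms


# A database query span with made-up values.
db_span = Span(
    trace_id="t1",
    span_id="s2",
    parent_id="s1",
    service="orders-db",
    operation="SELECT orders",
    start_ms=120.0,
    end_ms=165.0,
    attributes={"db.system": "postgresql"},
)
```

Here `db_span.duration_ms` gives the time spent in that one operation, and `error` stays `None` because the operation succeeded.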
What is a root span?
The root span (also called the entry span) is the first span in a trace and represents the end-to-end request (for example, an incoming HTTP request). All other spans are downstream work triggered by it, directly or indirectly. When troubleshooting latency, the root span duration is usually the “total time,” while child spans explain where that time was spent. The root span is typically the top-most parent. In distributed systems, each service may have its own “entry” span, but with most instrumentation the trace still has a single root.
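To make the "total time" idea concrete, here is a minimal sketch that finds the root span in a list of spans and splits its duration into per-span self time. The dictionary layout, the span names, and the timings are invented for illustration. The sketch also assumes that sibling spans run one after another, so with parallel children the subtraction would overstate the time spent downstream.

```python
# Hypothetical spans from one trace; times are ms since the trace started.
spans = [
    {"span_id": "a", "parent_id": None, "name": "GET /checkout", "start_ms": 0, "end_ms": 300},
    {"span_id": "b", "parent_id": "a", "name": "load cart", "start_ms": 10, "end_ms": 90},
    {"span_id": "c", "parent_id": "a", "name": "charge card", "start_ms": 100, "end_ms": 280},
    {"span_id": "d", "parent_id": "c", "name": "INSERT payment", "start_ms": 120, "end_ms": 200},
]


def find_root(spans):
    """Return the single span that has no parent."""
    roots = [s for s in spans if s["parent_id"] is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root span, found {len(roots)}")
    return roots[0]


def self_times(spans):
    """Span duration minus the durations of its direct children.

    Assumes children of the same parent do not overlap in time.
    """
    child_total = {}
    for s in spans:
        if s["parent_id"] is not None:
            duration = s["end_ms"] - s["start_ms"]
            child_total[s["parent_id"]] = child_total.get(s["parent_id"], 0) + duration
    return {
        s["span_id"]: (s["end_ms"] - s["start_ms"]) - child_total.get(s["span_id"], 0)
        for s in spans
    }


root = find_root(spans)
total_ms = root["end_ms"] - root["start_ms"]
breakdown = self_times(spans)
```

In this made-up trace the self times add up to the root span's duration, which is what lets child spans account for where the total time went.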
How spans relate within a trace
Spans are connected using parent‑child relationships that reflect how work flows through your system:
- A parent span represents a higher‑level operation.
- Child spans represent downstream work triggered by that operation.
Together, these relationships form a structured view of execution order, parallelism, and dependencies. Understanding this structure is essential for identifying where latency accumulates or where failures originate.
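The parent‑child structure can be sketched by grouping spans under their parent and walking the result from the top. The span records below, including the `parent_id` key and the operation names, are hypothetical. Real tracing data usually carries more fields.

```python
from collections import defaultdict

# Hypothetical spans from one trace; times are ms since the trace started.
spans = [
    {"span_id": "a", "parent_id": None, "name": "GET /checkout", "start": 0, "end": 300},
    {"span_id": "c", "parent_id": "a", "name": "payment-service: charge", "start": 100, "end": 280},
    {"span_id": "b", "parent_id": "a", "name": "cart-service: load cart", "start": 10, "end": 90},
    {"span_id": "d", "parent_id": "c", "name": "db: INSERT payment", "start": 120, "end": 200},
]


def index_children(spans):
    """Map each parent id to its child spans, ordered by start time."""
    children = defaultdict(list)
    for s in spans:
        children[s["parent_id"]].append(s)
    for siblings in children.values():
        siblings.sort(key=lambda s: s["start"])
    return children


def render_tree(spans):
    """Return one indented line per span, parents before their children."""
    children = index_children(spans)
    lines = []

    def walk(span, depth):
        duration = span["end"] - span["start"]
        lines.append(f"{'  ' * depth}{span['name']} ({duration} ms)")
        for child in children.get(span["span_id"], []):
            walk(child, depth + 1)

    for root in children.get(None, []):
        walk(root, 0)
    return lines


tree_lines = render_tree(spans)
```

Printing `tree_lines` one per line shows the execution order and nesting at a glance: the deeper the indentation, the further downstream the work.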
Rate, error, and duration (RED) metrics
Explore tracing surfaces RED metrics computed either from spans or from traces at the root span level, depending on the view you select, so you can quickly assess traffic (Rate), failures (Error), and latency distribution (Duration).
| Useful for investigating | Metric | Meaning |
|---|---|---|
| Sudden increases in traffic | Rate | Requests per second |
| Failures across services or instrumentation gaps | Error | Number of requests that fail |
| Slow responses and latency patterns | Duration | Time spent per request, shown as a histogram |
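As a simplified sketch of how the three metrics in the table could be derived, the function below computes them from a list of root spans collected over a fixed time window. This is not how Explore tracing computes them internally. The record fields, the 60-second window, and the bucket boundaries are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical root spans observed during one 60-second window.
root_spans = [
    {"duration_ms": 80, "error": False},
    {"duration_ms": 120, "error": False},
    {"duration_ms": 95, "error": True},
    {"duration_ms": 450, "error": False},
    {"duration_ms": 1500, "error": True},
    {"duration_ms": 100, "error": False},
]


def red_metrics(root_spans, window_seconds, bucket_bounds_ms):
    """Compute rate, error count, and a duration histogram.

    bucket_bounds_ms holds ascending, inclusive upper bounds. The last
    histogram slot counts requests slower than the largest bound.
    """
    rate = len(root_spans) / window_seconds
    errors = sum(1 for s in root_spans if s["error"])
    histogram = [0] * (len(bucket_bounds_ms) + 1)
    for s in root_spans:
        histogram[bisect_left(bucket_bounds_ms, s["duration_ms"])] += 1
    return {"rate_per_s": rate, "errors": errors, "duration_histogram": histogram}


metrics = red_metrics(root_spans, window_seconds=60, bucket_bounds_ms=[100, 500, 1000])
```

With these sample values, `metrics["duration_histogram"]` has four slots: up to 100 ms, up to 500 ms, up to 1000 ms, and anything slower. A long tail in the last slot is the kind of latency pattern the Duration row describes.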
