Aggregating data
Goal
By the end of this guide, you should be able to use `aggregate`, `groupby`, `avg`, `count_if`, and other functions to calculate metrics such as averages, totals, and conditional counts from your logs or traces.
Why it matters
Understanding what your systems are doing at scale often requires more than viewing raw logs; it requires summarization. Aggregations let you answer questions like “Which endpoints are the slowest?”, “How many error traces exceeded 1s?”, or “What is the average request size per service?” without manually inspecting individual events.
Count all logs or traces
Start with the most basic aggregation: a total count of documents.
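A query along these lines produces that count (the standalone `aggregate` form shown here is a sketch; your query language's exact syntax for an ungrouped aggregation may differ):

source logs
| aggregate count() as total_logs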
Output:
| total_logs |
| --- |
| 42 |
This gives you a single summary document showing how many logs or traces are in the current time range. It's often used as a quick validation before applying filters or grouping.
Group by a field and calculate an average
To understand how a metric varies across categories, such as average response time per route, you can combine `groupby` with `avg`.
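The output below comes from a query along these lines (shown as a sketch; `duration` is assumed to be the span duration field used throughout this guide):

source spans
| groupby cloud_region aggregate avg(duration) as avg_duration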
Output
| cloud_region | avg_duration |
| --- | --- |
| eu-north-1 | 577441.4083135392 |
| eu-west-1 | 2090822.126859143 |
| us-east-2 | 301.00681648269494 |
This result shows that `us-east-2` is significantly faster than the other regions. Grouping by key fields like `path`, `service`, or `subsystem` can reveal performance bottlenecks across your system.
Add multiple aggregations at once
You can compute multiple metrics per group in a single query.
source spans
| groupby cloud_region aggregate
    avg(duration) as avg_duration,
    max(duration) as max_duration,
    count() as request_count
Output
| cloud_region | avg_duration | max_duration | request_count |
| --- | --- | --- | --- |
| eu-north-1 | 621596.129809023 | 9843640 | 3613 |
| eu-west-1 | 2116719.928825623 | 60559729 | 1124 |
| us-east-2 | 365.9974160592664 | 96948 | 66952 |
This result shows not just the average but also the worst-case response time and how often each region is hit, which helps you correlate performance issues with traffic volume.
Filter before aggregating
It's best practice to apply filters before aggregating to focus your analysis; it also improves query performance, since less data reaches the aggregation stage. For example, you can count only slow requests.
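A sketch of such a query, assuming a `filter` stage runs before the aggregation (the duration threshold below is illustrative; pick a cutoff that matches your definition of slow):

source spans
| filter duration > 1000000
| groupby cloud_region aggregate count() as slow_request_count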
Output
| cloud_region | slow_request_count |
| --- | --- |
| eu-north-1 | 3408 |
| eu-west-1 | 1067 |
| us-east-2 | 5880 |
This output highlights that most slow requests are concentrated in the `us-east-2` cloud region. You can adjust the threshold or use other fields (e.g., `status_code`, `severity`) to customize your filters.
Use `count_if` for conditional metrics
`count_if` lets you tally only the events that meet a condition, while still keeping the rest of the dataset available for other aggregations.
source spans
| groupby cloud_region aggregate
    count() as total_requests,
    count_if($m.duration > 500) as slow_requests
Output
| cloud_region | total_requests | slow_requests |
| --- | --- | --- |
| eu-north-1 | 4064 | 3784 |
| eu-west-1 | 1134 | 1086 |
| us-east-2 | 75219 | 4194 |
This pattern lets you compute derived metrics like error rates or slow-request ratios. For example, you could later calculate `count_if($m.duration > 500) / count() as slow_request_ratio` to get the fraction of problematic requests per region.
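As a sketch, assuming aggregate expressions can be combined arithmetically in a single clause:

source spans
| groupby cloud_region aggregate
    count_if($m.duration > 500) / count() as slow_request_ratio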
Group by calculated severity level
You’re not limited to grouping by raw fields: `groupby` can also take expressions. One common use case is grouping logs into custom severity bands based on a numeric field like `duration`.
For example, let’s say any request over 1000ms is super slow, over 500ms is slow, and anything faster is ok. You can express this logic inline using a `case` expression:
source spans
| groupby case {
    duration > 1000 -> 'super slow',
    duration > 500 -> 'slow',
    _ -> 'ok'
  } as latency_level
  aggregate count() as request_count
Output
| latency_level | request_count |
| --- | --- |
| ok | 160687 |
| slow | 65991 |
| super slow | 191280 |
This gives you a simple breakdown of how many requests fall into each latency tier—ideal for tracking SLO compliance or surfacing anomalies in dashboards. You can tweak the thresholds or logic to match any internal definition of "slowness."
Aggregate over spans (traces)
Aggregations work just as well on trace data. You can switch to the `spans` source and run the same types of calculations.
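For example, a query along these lines yields the summary below (the standalone `aggregate` form is again a sketch rather than exact syntax):

source spans
| aggregate count() as trace_count, max(duration) as max_trace_duration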
Output
| trace_count | max_trace_duration |
| --- | --- |
| 755903 | 60440745 |
This summary shows how many traces were recorded in the selected time window and what the longest span duration was—useful for understanding system throughput and outliers in trace performance.
Common pitfalls or gotchas
- `groupby` removes all fields not explicitly included. Add fields as grouping keys or aggregations if you want to preserve them (see the sketch after this list).
- Aggregations like `avg`, `count_if`, and `max` ignore `null` values. Make sure the fields you're aggregating on are populated.
- You can only group by scalar values; arrays and objects are not supported unless flattened or transformed first.
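To illustrate the first pitfall, here is a sketch that preserves `subsystem` by adding it as a second grouping key (assuming grouping keys can be comma-separated):

source spans
| groupby cloud_region, subsystem aggregate count() as request_count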