`window`

Description

Applies a window function over a set of related rows and adds the result as a new column to every row in the result set.

A window operation does not collapse rows (unlike groupby). It enriches each row with a computed value.
You can optionally:
Partition the data into independent groups using by
Order rows inside each partition using orderby
Define a frame to limit which rows are considered for each output row (e.g., “previous 10 rows”)

Note

A single window operator supports one window function call inside calc. To add multiple window-based columns, chain multiple window operators.

Warning

If you use an orderby clause, the window operator will re-order the result set by the orderby expressions.

Syntax

window
  [ by <expr> [, <expr> ...] ]
  [ orderby <expr> [asc|desc] [, <expr> [asc|desc] ...] ]
  [ <frame-type> <frame-function> ]
  calc <window-function-expression> as <alias>

Arguments

Name	Type	Required	Description
by	list[scalar-expression]	false	Partitions the current result set into independent groups (“buckets”). The window calculation is performed separately within each partition. If omitted, the window is computed over the entire result set (no partitioning).
orderby	list[scalar-expression (asc\|desc)]	false	Defines the row order within each partition.
frame-type	enum	false	Defines how the window frame is interpreted.
frame-function	frame-function	false	Defines which rows are included in the calculation **relative to the current row.

frame-type

rows — frame boundaries are based on row counts (e.g., previous 10 rows)
range — frame boundaries are based on the ordered value (commonly time intervals when ordering by timestamp)
groups — frame boundaries are based on peer groups according to the orderby clause

If omitted, a default frame is applied (see frame-function).

frame-function

Supported frame functions:

between(from, to) — explicit frame bounds (both from and to must be literals)
all() — equivalent to between(null, null)
running() — cumulative window up to the current row (equivalent to between(to=0))
previous(n) — previous n rows / groups / interval, excluding the current row (between(-n, -1))
next(n) — next n rows / groups / interval, excluding the current row (between(1, n))
lookbehind(n) — includes current row and n rows / groups / interval behind (between(-n, 0))
lookahead(n) — includes current row and n rows / groups / interval ahead (between(0, n))
lookaround(n) — includes current row and n rows / groups / interval on both sides (between(-n, n))

Mapping of between(from, to) to SQL-style semantics:

from = null → unbounded preceding
to = null → unbounded following
from/to = 0 → current row
from/to = -1 → 1 preceding
from/to = 1 → 1 following

Defaults (Postgres-like semantics):

If orderby is omitted → default frame is rows all()
If orderby is present → default frame is groups running()

Validations:

lookahead, lookbehind, lookaround, next, previous: n must be a non-negative literal
between(from, to): from and to must be literals

Rules:

Must contain exactly one window function call.
Can include normal scalar operations around the window function (e.g., value / avg(value)).
The result can be named using as <alias>.

Supported window functions include:

Ranking

dense_rank() — Returns an integer rank (starting at 1) within the partition, without gaps. Peer rows (rows with equal orderby values) share the same rank, and the next distinct orderby value increases the rank by 1.
percent_rank() — Returns the relative rank as a fraction in [0, 1], computed as (rank() - 1) / (count(*) - 1) within the partition. Returns 0 when the partition contains a single row. Because it’s based on rank(), it may not reach 1 if the last rows are peers (ties).
rank() — Returns an integer rank (starting at 1) within the partition, with gaps. Peer rows share the same rank, and the next rank jumps by the number of peer rows (e.g., ordered values [10, 10, 20] → ranks [1, 1, 3]).
row_number() — Returns a unique sequential integer for each row (starting at 1) in the partition, ordered by orderby. Rows that compare equal in orderby still get different row numbers; add additional orderby expressions to make tie-breaking deterministic.
cume_dist() — Returns the cumulative distribution as a fraction in (0, 1], computed as count(rows with orderby value <= current row’s value) / count(*) within the partition. All peer rows return the same value.
ntile(n: number) — Splits the ordered partition into n buckets as evenly as possible and returns the 1-based bucket number for the current row. Bucket sizes differ by at most 1; if the rows don’t divide evenly, earlier buckets contain one extra row. n must be a positive integer.

Value

lag(expr, offset?: number = 1, default?: any = null)
lead(expr, offset?: number = 1, default?: any = null)
first_value(expr)
nth_value(expr, n: number)
last_value(expr)

Aggregations

Aggregation functions can also be used as window functions.
All aggregation functions work for cumulative frames (e.g., all(), running()).
For sliding frames (where rows can leave the window), only these aggregates are supported: avg(), count(), max(), min(), sum(), stddev(), variance(), sample_stddev(), sample_variance()

Example 1

Use case: Keep one DataPrime query per user based on bytes scanned

Partition queries by user, rank each user’s queries using the orderby clause, and then keep only the row with rank == 1.

Example data

{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q1", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 1200 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q2", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 900 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q3", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 5000 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q4", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 3000 } } } } }

Example query

source system/engine.queries
| filter queryInfo.interfaceType == 'dataprime'
| window by clientInfo.userEmail
    orderby queryInfo.queryOutcome.storage.stats.bytesScanned
    calc rank() as rank
| filter rank == 1
| choose clientInfo.userEmail as user, queryInfo.queryId,queryInfo.queryOutcome.storage.stats.bytesScanned

Example output

user	queryId	bytesScanned
[email protected]	q2	900
[email protected]	q4	3000

Example 2

Use case: Keep the top 3 most recent log entries per application

Partition logs by applicationname, rank each entry from newest to oldest using dense_rank(), then keep only the first 3 ranks per partition. This pattern returns top N rows per group — distinct from the top command, which returns the top N groups overall.

Example data

{ "timestamp": "2026-05-11T12:40:23Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:51Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:31Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:38:00Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:09Z", "applicationname": "claude-code" },
{ "timestamp": "2026-05-11T12:39:08Z", "applicationname": "claude-code" },
{ "timestamp": "2026-05-11T12:30:00Z", "applicationname": "claude-code" }

Example query

source application_logs
| window by applicationname
    orderby timestamp desc
    calc dense_rank() as rank
| filter rank <= 3
| choose timestamp, applicationname

Example output

timestamp	applicationname
2026-05-11T12:40:23Z	cx-metrics
2026-05-11T12:39:51Z	cx-metrics
2026-05-11T12:39:31Z	cx-metrics
2026-05-11T12:39:09Z	claude-code
2026-05-11T12:39:08Z	claude-code
2026-05-11T12:30:00Z	claude-code

Example 3

Use case: Detect slow HTTP requests compared to the previous 10 requests (per method)

Order requests by time, then compute the average duration over the previous 10 rows in the same HTTP method partition. Finally, filter requests that are more than 3× slower than that baseline.

Example data

{ "method": "GET", "timestamp": "2026-02-21T10:00:00Z", "duration_ms": 120 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:01Z", "duration_ms": 110 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:02Z", "duration_ms": 115 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:03Z", "duration_ms": 900 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:00Z", "duration_ms": 200 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:01Z", "duration_ms": 210 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:02Z", "duration_ms": 205 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:03Z", "duration_ms": 800 }

Example query

source http_logs
| window by method
    orderby timestamp asc
    rows previous(10)
    calc avg(duration_ms) as avg_prev_10_ms
| filter duration_ms > avg_prev_10_ms * 3
| choose method, timestamp, duration_ms, avg_prev_10_ms

Example output

method	timestamp	duration_ms	avg_prev_10_ms
GET	2026-02-21T10:00:03Z	900	115
POST	2026-02-21T10:00:03Z	800	205

Need help? Contact Support.

What's new? Find out here.

LLM? Read llms.txt.

Previous wildfind / wildtext

Next Functions overview