window
Description
Applies a window function over a set of related rows and adds the result as a new column to every row in the result set.
- A
windowoperation does not collapse rows (unlikegroupby). It enriches each row with a computed value. - You can optionally:
- Partition the data into independent groups using
by - Order rows inside each partition using
orderby - Define a frame to limit which rows are considered for each output row (e.g., “previous 10 rows”)
Note
A single window operator supports one window function call inside calc. To add multiple window-based columns, chain multiple window operators.
Warning
If you use an orderby clause, the window operator will re-order the result set by the orderby expressions.
Syntax
window
[ by <expr> [, <expr> ...] ]
[ orderby <expr> [asc|desc] [, <expr> [asc|desc] ...] ]
[ <frame-type> <frame-function> ]
calc <window-function-expression> as <alias>
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
| by | list[scalar-expression] | false | Partitions the current result set into independent groups (“buckets”). The window calculation is performed separately within each partition. If omitted, the window is computed over the entire result set (no partitioning). |
| orderby | list[scalar-expression (asc|desc)] | false | Defines the row order within each partition. |
| frame-type | enum | false | Defines how the window frame is interpreted. |
| frame-function | frame-function | false | Defines which rows are included in the calculation **relative to the current row. |
frame-type
rows— frame boundaries are based on row counts (e.g., previous 10 rows)range— frame boundaries are based on the ordered value (commonly time intervals when ordering by timestamp)groups— frame boundaries are based on peer groups according to theorderbyclause
If omitted, a default frame is applied (see frame-function).
frame-function
Supported frame functions:
between(from, to)— explicit frame bounds (bothfromandtomust be literals)all()— equivalent tobetween(null, null)running()— cumulative window up to the current row (equivalent tobetween(to=0))previous(n)— previous n rows / groups / interval, excluding the current row (between(-n, -1))next(n)— next n rows / groups / interval, excluding the current row (between(1, n))lookbehind(n)— includes current row and n rows / groups / interval behind (between(-n, 0))lookahead(n)— includes current row and n rows / groups / interval ahead (between(0, n))lookaround(n)— includes current row and n rows / groups / interval on both sides (between(-n, n))
Mapping of between(from, to) to SQL-style semantics:
from = null→ unbounded precedingto = null→ unbounded followingfrom/to = 0→ current rowfrom/to = -1→ 1 precedingfrom/to = 1→ 1 following
Defaults (Postgres-like semantics):
- If
orderbyis omitted → default frame isrows all() - If
orderbyis present → default frame isgroups running()
Validations:
lookahead,lookbehind,lookaround,next,previous:nmust be a non-negative literalbetween(from, to):fromandtomust be literals
Rules:
- Must contain exactly one window function call.
- Can include normal scalar operations around the window function (e.g.,
value / avg(value)). - The result can be named using
as <alias>.
Supported window functions include:
Ranking
dense_rank()— Returns an integer rank (starting at1) within the partition, without gaps. Peer rows (rows with equalorderbyvalues) share the same rank, and the next distinctorderbyvalue increases the rank by 1.percent_rank()— Returns the relative rank as a fraction in[0, 1], computed as(rank() - 1) / (count(*) - 1)within the partition. Returns0when the partition contains a single row. Because it’s based onrank(), it may not reach1if the last rows are peers (ties).rank()— Returns an integer rank (starting at1) within the partition, with gaps. Peer rows share the same rank, and the next rank jumps by the number of peer rows (e.g., ordered values[10, 10, 20]→ ranks[1, 1, 3]).row_number()— Returns a unique sequential integer for each row (starting at1) in the partition, ordered byorderby. Rows that compare equal inorderbystill get different row numbers; add additionalorderbyexpressions to make tie-breaking deterministic.cume_dist()— Returns the cumulative distribution as a fraction in(0, 1], computed ascount(rows with orderby value <= current row’s value) / count(*)within the partition. All peer rows return the same value.ntile(n: number)— Splits the ordered partition intonbuckets as evenly as possible and returns the 1-based bucket number for the current row. Bucket sizes differ by at most1; if the rows don’t divide evenly, earlier buckets contain one extra row.nmust be a positive integer.
Value
lag(expr, offset?: number = 1, default?: any = null)lead(expr, offset?: number = 1, default?: any = null)first_value(expr)nth_value(expr, n: number)last_value(expr)
Aggregations
- Aggregation functions can also be used as window functions.
- All aggregation functions work for cumulative frames (e.g.,
all(),running()). - For sliding frames (where rows can leave the window), only these aggregates are supported:
avg(),count(),max(),min(),sum(),stddev(),variance(),sample_stddev(),sample_variance()
Example 1
Use case: Keep one DataPrime query per user based on bytes scanned
Partition queries by user, rank each user’s queries using the orderby clause, and then keep only the row with rank == 1.
Example data
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q1", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 1200 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q2", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 900 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q3", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 5000 } } } } },
{ "clientInfo": { "userEmail": "[email protected]" }, "queryInfo": { "queryId": "q4", "interfaceType": "dataprime", "queryOutcome": { "storage": { "stats": { "bytesScanned": 3000 } } } } }
Example query
source system/engine.queries
| filter queryInfo.interfaceType == 'dataprime'
| window by clientInfo.userEmail
orderby queryInfo.queryOutcome.storage.stats.bytesScanned
calc rank() as rank
| filter rank == 1
| choose clientInfo.userEmail as user, queryInfo.queryId,queryInfo.queryOutcome.storage.stats.bytesScanned
Example output
| user | queryId | bytesScanned |
|---|---|---|
| [email protected] | q2 | 900 |
| [email protected] | q4 | 3000 |
Example 2
Use case: Keep the top 3 most recent log entries per application
Partition logs by applicationname, rank each entry from newest to oldest using dense_rank(), then keep only the first 3 ranks per partition. This pattern returns top N rows per group — distinct from the top command, which returns the top N groups overall.
Example data
{ "timestamp": "2026-05-11T12:40:23Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:51Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:31Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:38:00Z", "applicationname": "cx-metrics" },
{ "timestamp": "2026-05-11T12:39:09Z", "applicationname": "claude-code" },
{ "timestamp": "2026-05-11T12:39:08Z", "applicationname": "claude-code" },
{ "timestamp": "2026-05-11T12:30:00Z", "applicationname": "claude-code" }
Example query
source application_logs
| window by applicationname
orderby timestamp desc
calc dense_rank() as rank
| filter rank <= 3
| choose timestamp, applicationname
Example output
| timestamp | applicationname |
|---|---|
| 2026-05-11T12:40:23Z | cx-metrics |
| 2026-05-11T12:39:51Z | cx-metrics |
| 2026-05-11T12:39:31Z | cx-metrics |
| 2026-05-11T12:39:09Z | claude-code |
| 2026-05-11T12:39:08Z | claude-code |
| 2026-05-11T12:30:00Z | claude-code |
Example 3
Use case: Detect slow HTTP requests compared to the previous 10 requests (per method)
Order requests by time, then compute the average duration over the previous 10 rows in the same HTTP method partition. Finally, filter requests that are more than 3× slower than that baseline.
Example data
{ "method": "GET", "timestamp": "2026-02-21T10:00:00Z", "duration_ms": 120 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:01Z", "duration_ms": 110 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:02Z", "duration_ms": 115 },
{ "method": "GET", "timestamp": "2026-02-21T10:00:03Z", "duration_ms": 900 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:00Z", "duration_ms": 200 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:01Z", "duration_ms": 210 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:02Z", "duration_ms": 205 },
{ "method": "POST", "timestamp": "2026-02-21T10:00:03Z", "duration_ms": 800 }
Example query
source http_logs
| window by method
orderby timestamp asc
rows previous(10)
calc avg(duration_ms) as avg_prev_10_ms
| filter duration_ms > avg_prev_10_ms * 3
| choose method, timestamp, duration_ms, avg_prev_10_ms
Example output
| method | timestamp | duration_ms | avg_prev_10_ms |
|---|---|---|---|
| GET | 2026-02-21T10:00:03Z | 900 | 115 |
| POST | 2026-02-21T10:00:03Z | 800 | 205 |