Breaking News from AWS re:Invent
Coralogix receives AWS Rising Star award!
This guide provides a full glossary of all available DataPrime operators and expressions.
To hit the ground running using DataPrime and to view only our most frequently-used operators with examples, view our DataPrime Cheat Sheet.
The negation of filter
. Filters-out all events where the condition is true
. The same effect can be achieved by using filter
with !(condition)
.
block $d.status_code >= 200 && $d.status_code <= 299 # Leave all events which don't have a status code of 2xx
The data is exposed using the following top-level fields:
$m
-Event metadata
timestamp
severity
– Possible values are Verbose
, Debug
, Info
, Warning
, Error
, Critical
priorityclass
– Possible values are high
, medium
, low
logid
$l
-Event labels
applicationname
subsystemname
category
classname
computername
methodname
threadid
ipaddress
$d
-The user’s dataNo grouping variation
Limits the rows returned to a specified number and order the result by a set of expressions.
order_direction := "descending"/"ascending" according to top/bottom bottom <limit> <result_expression1> [as <alias>] [, <result_expression2> [as <alias2>], ...] by <orderby_expression> [as alias>]
For example, the following query:
bottom 5 $m.severity as $d.log_severity by $d.duration
Will result in logs of the following form:
[ { "log_severity": "Debug", "duration": 1000 } { "log_severity": "Warning", "duration": 2000 }, ... ]
Grouping variation
Limits the rows returned to a specified number and group them by a set of aggregation expressions and order them by a set of expressions.
order_direction := "descending"/"ascending" according to top/bottom bottom <limit> <(groupby_expression1|aggregate_function1)> [as <alias>] [, <(groupby_expression2|aggregate_function2)> [as <alias2>], ...] by <(groupby_expression1|aggregate_function1)> [as <alias>]
For example, the following query:
bottom 10 $m.severity, count() as $d.number_of_severities by avg($d.duration) as $d.avg_duration
Will result in logs of the following form:
[ { "severity": "Warning", "number_of_severities": 50, avg_duration: 1000 }, { "severity": "Debug", "number_of_severities": 10, avg_duration: 2000 } ... ]
Supported aggregation functions are listed in “Aggregation Functions” section.
Leave only the keypaths provided, discarding all other keys. Fully supports nested keypaths in the output.
(choose|select) <keypath1> [as <new_keypath>],<keypath2> [as <new_keypath>],...
Examples:
choose $d.mysuperkey.myfield choose $d.my_superkey.mykey as $d.important_value, 10 as $d.the_value_ten
Convert the data types of keys.
The datatypes
keyword is optional and can be used for readability.
(conv|convert) [datatypes] <keypath1>:<datatype1>,<keypath2>:<datatype2>,...
Examples:
convert $d.level:number conv datatypes $d.long:number,$d.lat:number convert $d.data.color:number,$d.item:string
Returns a single row containing the number of rows produced by the preceding operators.
count [into <keypath>]
An alias can be provided to override the keypath the result will be written to.
For example, the following part of a query
count into $d.num_rows
will result in a single row of the following form:
{ "num_rows": 7532 }
Returns a row counting all the rows grouped by the expression.
countby <expression> [as <alias>] [into <keypath>]
An alias can be provided to override the keypath the result will be written into.
For example, the following part of a query
countby $d.verb into $d.verb_count
will result in a row for each group.
It is functionally identical to
groupby $data.verb calculate count() as $d.verb_count
Create a new key and set its value to the result of the expression. Key creation is granular, meaning that parent keys in the path are not overwritten.
(a|add|c|create) <keypath> from <expression> [on keypath exists (fail|skip|overwrite)] [on keypath missing (fail|create|skip)] [on datatype change (skip|fail|overwrite)
The creation can be controlled by adding the following clauses:
on keypath exists
allows to choose what to do when the keypath already exists.overwrite
– Overwrites the old value. This is the default valuefail
– Fails the queryskip
– Skips the creation of the keyon keypath missing
allows to choose what to do when the new keypath does not exist.create
– Creates the key. This is the default valuefail
– Fails the queryskip
– Skips the creation of the new keyon datatype changed
allows to choose what to do if the key already exists and the new data changes the datatype of the valueoverwrite
– Overwrites the value anyway. This is the default valuefail
– Fails the queryskip
– Leaves the key with the original value (and type)Examples:
create $d.radius from 100+23 c $d.log_data.truncated_message from $d.message.substring(1,50) c $data.trimmed_name from $data.username.trim() create $d.temperature from 100*23 on datatype changed skip
Returns one row for each distinct combination of the provided expressions.
distinct <expression> [as <alias>] [, <expression_2> [as <alias_2>], ...]
This operator is functionally identical to groupby
without any aggregate functions.
Enrich your logs using additional context from a lookup table.
Upload your lookup table using the Data Flow > Data Enrichment > Custom Enrichment section.
For more details, see Custom Enrichment documentation.
enrich <value_to_lookup> into <enriched_key> using <lookup_table>
value_to_lookup
– A string expression that will be looked up in the lookup table.enriched_key
– Destination key to store the enrichment result in.lookup_table
– The name of the Custom Enrichment table to be used.The table’s columns will be added as sub-keys to the destination key. If value_to_lookup
is not found, the destination key will be null
.
You can then filter the results using the DataPrime capabilities, such as filtering logs by specific value in the enriched field.
Example:
The original log:
{ "userid": "111", ... }
The Custom Enrichment lookup table called my_users
:
ID | Name | Department |
---|---|---|
111 | John | Finance |
222 | Emily | IT |
Running the following query:
enrich $d.userid into $d.user_enriched using my_users
Gives the following enriched log:
{ "userid": "111", "user_enriched": { "ID": "111", "Name": "John", "Department": "Finance" }, ... }
Notes:
source <lookup_table>
to view the enrichment table.<value_to_lookup>
exists in the <lookup_table>
, the sub-keys will be updated with the new value. If the <value_to_lookup>
does not exist, their current value will remain.<lookup_table>
will remain with their existing values.<lookup_table>
are considered to be strings. This means that:
<value_to_lookup>
must be in a string format.For more information, see the enrich section in the DataPrime Glossary.
Extract data from some string value into a new object. Multiple extraction methods are supported.
(e|extract) <expression> into <keypath> using <extraction-type>(<extraction-params>) [datatypes keypath:datatype,keypath:datatype,...]
Here are the currently supported extraction methods, and their parameters:
regexp
– Create a new object based on regexp capture-groupse
– A regular expression with names capture-groups.Example:
extract $d.my_text into $d.my_data using regexp(e=/user (?<user>.*) has logged in/)
kv
– Extract a new object from a string that contains key=value key=value...
pairspair_delimiter
– The delimiter to expect between pairs. Default is (a space)key_delimiter
– The delimiter to expect separating between a key and a value. Default is =
.Examples:
extract $d.text into $d.my_kvs using kv() e $d.text into $d.my_kvs using kv(pair_delimiter=' ',key_delimiter='=')
jsonobject
– Extract a new object from a string contains an encoded json object, potentially attempting to unescape the string before decoding it into a jsonmax_unescape_count
– Max number of escaping levels to unescape before parsing the json. Default is 1. When set to 1 or more, the engine will detect whether the value contains an escaped JSON string and unescape it until its parsable or max unescape count ie exceeded.Example:
e $d.json_message_as_str into $d.json_message using jsonobject(max_unescape_count=1)
Additional extraction methods will be supported in the future.
It is possible to provide datatype information as part of the extraction, by using the datatypes
clause. For example, adding datatypes my_field:number
to an extraction would cause the extract my_field
keypath to be a number instead of a string. For example:
extract $d.my_msg into $d.data using kv() datatypes my_field:number
Extracted data always goes into a new keypath as an object, allowing further processing of the new keys inside that new object. For example:
# Assuming a dataset which look like that: { "msg": "query_type=fetch query_id=100 query_results_duration_ms=232" } { "msg": "query_type=fetch query_id=200 query_results_duration_ms=1001" } # And the following DataPrime query: source logs | extract $d.msg into $d.query_data using kv() datatypes query_results_duration_ms:number | filter $d.query_data.query_results_duration_ms > 500 # The results will contain only the second message, in which the duration is larger than 500 ms
Filter events, leaving only events for which the condition evaluates to true.
(f|filter|where) <condition-expression>
Examples:
f $d.radius > 10 filter $m.severity.toUpperCase() == 'INFO' filter $l.applicationname == 'recommender' filter $l.applicationname == 'myapp' && $d.msg.contains('failure')
Note: Comparison with null currently works only for scalar values and will always return null on json subtrees.
Groups the results of the preceding operators by the specified grouping expressions and calculates aggregate functions for every group created.
groupby <grouping_expression> [as <alias>] [, <grouping_expression_2> [as <alias_2>], ...] [calculate] <aggregate_function> [as <result_keypath>] [, <aggregate_function_2> [as <result_keypath_2], ...]
For example, the following query:
groupby $m.severity calculate sum($d.duration)
Will result in logs of the following form:
{ "severity": "Warning", "_sum": 17045 }
The keypaths for the grouping expressions will always be under $d
. Using the as
keyword, we can rename the keypath for the grouping expressions and aggregation functions. The following query:
groupby $l.applicationname as $d.app calculate sum($d.duration) as $d.sum_duration
Will result in logs of the following form:
{ "app": "web-api", "sum_duration": 17045 }
Notes:
avg
, max
, sum
) to the bucket of results. This feature gives you the power to manipulate an aggregation expression inside the expression itself, allowing you to calculate and manipulate your data simultaneously. Examples of DataPrime expressions in aggregations can be found here.Limits the output to the first <event-count>
events.
limit <event-count>
Examples
limit 100
Move a key (including its child keys, if any) to a new location.
(m|move) <source-keypath> to <target-keypath>
Examples:
move $d.my_data.hostname to $d.my_new_data.host m $d.kubernetes.labels to $d.my_labels
Sort the data by ascending/descending order of the expression value. Ordering by multiple expressions is supported.
(orderby|sortby|order by|sort by) <expression> [(asc|desc)] , ...
Examples:
orderby $d.myfield.myfield orderby $d.myfield.myfield:number desc sortby $d.myfield desc
Note: Sorting numeric values can be done by casting expression to the type: e.g.<expression>: number
. In some cases, this will be inferred automatically by the engine.
Replace all substrings matching a regexp pattern from some keypath value, effectively hiding the original content.
The matching keyword is optional and can be used to increase readability.
redact <keypath> [matching] /<regular-expression>/ to '<redacted_str>' redact <keypath> [matching] <string> to '<redacted_str>'
Examples:
redact $d.mykey /[0-9]+/ to 'SOME_INTEGER' redact $d.mysuperkey.user_id 'root' to 'UNKNOWN_USER' redact $d.mysuperkey.user_id matchingn 'root' to 'UNKNOWN_USER'
Remove a keypath from the object.
r|remove <keypath1> [ "," <keypath2> ]...
Examples:
r $d.mydata.unneeded_key remove $d.mysuperkey.service_name, $d.mysuperkey.unneeded_key
Replace the value of some key with a new value.
If the replacement value changes the datatype of the keypath, the following will happen:
skip
– The replacement will be ignoredfail
– The query will failoverwrite
– The new value will overwrite the previous one, changing the datatype of the keypathreplace <keypath> with <expression> [on datatype changed skip/fail/overwrite]
Examples:
replace $d.message with null replace $d.some_superkey.log_length_plus_10 with $d.original_log.length()+10 on datatype changed overwrite
Rounds the time of the event into some time interval, possibly creating a new key for the result.
If source-timestamp
is not provided, then $m.timestamp
is used as the source timestamp.
If source-timestamp
is provided, it should be of type (or cast to) timestamp
.
By default, the rounded result is written back to the source keypath [source-timestamp]
.
If into <target-keypath>
is provided, then [source-timestamp]
is not modified, and the result is written to a new target-keypath
.
Supported time intervals are:
Xns
– X nanoseconds (beware of the source-timestamp
‘s resolution)Xms
– X millisecondsXs
– X secondsXm
– X minutesXh
– X hoursXd
– X daysAnd any combination of the above from bigger to smaller time unit, e.g. 1h30m15s
.
roundtime [source-timestamp] to <time-interval> [into <target-keypath>]
Examples:
roundtime to 1h into $d.tm roundtime $d.timestamp to 1h roundtime $d.my_timestamp: timestamp to 60m roundtime to 60s into $d.rounded_ts_to_the_minute
Set the data source that your DataPrime query is based on.
(source|from) <data_store>
Where <data_store>
can be either:
logs
spans
(supported only in the API)Examples:
source logs
No grouping variation
Limits the rows returned to a specified number and order the result by a set of expressions.
order_direction := "descending"/"ascending" according to top/bottom top <limit> <result_expression1> [as <alias>] [, <result_expression2> [as <alias2>], ...] by <orderby_expression> [as alias>]
For example, the following query:
top 5 $m.severity as $d.log_severity by $d.duration
Will result in logs of the following form:
[ { "log_severity": "Warning", "duration": 2000 }, { "log_severity": "Debug", "duration": 1000 } ... ]
Grouping variation
Limits the rows returned to a specified number and group them by a set of aggregation expressions and order them by a set of expressions.
order_direction := "descending"/"ascending" according to top/bottom top <limit> <(groupby_expression1|aggregate_function1)> [as <alias>] [, <(groupby_expression2|aggregate_function2)> [as <alias2>], ...] by <(groupby_expression1|aggregate_function1)> [as <alias>]
For example, the following query:
top 10 $m.severity, count() as $d.number_of_severities by avg($d.duration) as $d.avg_duration
Will result in logs of the following form:
[ { "severity": "Debug", "number_of_severities": 10, avg_duration: 2000 } { "severity": "Warning", "number_of_severities": 50, avg_duration: 1000 }, ... ]
Supported aggregation functions are listed in “Aggregation Functions” section.
Search for the string in a certain keypath.
(find|text) <free-text-string> in <keypath>
Examples:
find 'host1000' in $d.kubernetes.hostname text 'us-east-1' in $d.msg
A generic lucene-compatible operator, allowing both free and wild text searches, and more complex search queries.
Field names inside the lucene query are relative to $d
(the root level of user-data).
lucene <lucene-query-as-a-string>
Examples:
lucene 'pod:recommender AND (is_error:true or status_code:404)'
Search for the string in the entire user data. This can be used when the keypath in which the text resides is unknown.
Note: The performance of this operator is worse than when using the find
/text
operator. Prefer using those operators when you know the keypath to search for.
(wildfind/wildtext) <string>
Examples:
wildfind 'my-region' wildfind ':9092'
DataPrime supports a limited set of javascript constructs that can be used in expressions.
The data is exposed using the following top-level fields:
$m
– Event metadata
timestamp
severity
– Possible values are Verbose
, Debug
, Info
, Warning
, Error
, Critical
priorityclass
– Possible values are high
, medium
, low
logid
$l
– Event labels
applicationname
subsystemname
category
classname
computername
methodname
threadid
ipaddress
$d
– The user’s dataAccessing nested data is done by using a keypath, similar to any programming language or json tool. Keys with special characters can be accessed using a map-like syntax, with the key string as the map index, e.g. $d.my_superkey['my_field_with_a_special/character']
.
Examples:
$m.timestamp $d.my_superkey.myfield $d.my_superkey['my_field_with_a_special/character'] $l.applicationname
All standard language constructs are supported:
&&
, ||
, !
concat
(string interpolation will be supported soon)$d.temperature:number
. Type inference is automatically applied when possible. We’ll support full type-inference soon, reducing the need for casting.Boolean expressions for text search:
$d.field ~ 'text phrase'
– case-insensitive search for a text phrase in a specific field.$d ~~ 'text phrase'
– case-insensitive search for a text phrase in $d
.Various functions can be used to transform values. All functions can be called as methods as well, e.g. $d.msg.contains('x')
is equivalent to contains($d.msg,'x')
.
chr(number: number): string
Returns the Unicode code point number
as a single character string.
codepoint(string: string): number
Returns the Unicode code point of the only character of string
.
concat(value: string, ...values: string): string
Concatenates multiple strings into one.
contains(string: string, substring: string): bool
Returns true
if substring
is contained in string
endsWith(string: string, suffix: string): bool
Returns true
if string
ends with suffix
indexOf(string: string, substring: string): number
Returns the position of substring
in string
, or -1
if not found.
length(value: string): number
Returns the length of value
ltrim(value: string): string
Removes whitespace to the left of the string value
matches(string: string, regexp: regexp): bool
Evaluates the regular expression pattern and determines if it is contained within string.
Alias for padLeft
pad(value: string, charCount: number, fillWith: string): string
Left pads string to charCount. If size < fillWith.length()
of string, result is truncated. See padLeft for more details.
padLeft(value: string, charCount: number, fillWith: string): string
Left pads string to charCount. If size < fillWith.length()
of string, result is truncated.
padRight(value: string, charCount: number, fillWith: string): string
Right pads string to charCount. If size < fillWith.length()
of string, result is truncated.
regexpSplitParts(string: string, delimiter: regexp, index: number): string
Splits string on regexp-delimiter, returns the field at index. Indexes start with 1.
rtrim(value: string): string
Removes whitespace to the right of the string value
splitParts(string: string, delimiter: string, index: number): string
Splits string on delimiter, returns the field at index. Indexes start with 1.
startsWith(string: string, prefix: string): bool
Returns true
if string
starts with prefix
substr(value: string, from: number, length: number?): string
Returns the substring in value
, from position from
and up to length length
toLowerCase(value: string): string
Converts value
to lowercase
toUpperCase(value: string): string
Converts value
to uppercase
trim(value: string): string
Removes whitespace from the edges of a string value
ipInSubnet(ip: string, ipPrefix: string): bool
Returns true if ip is in the subnet of ipPrefix.
ipPrefix(ip: string, subnetSize: number): string
Returns the IP prefix of a given ip_address with subnetSize bits (e.g.: 192.128.0.0/9
).
`this is an interpolated {$d.some_keypath} string`
– {$d.some_keypath}
will be replaced with the evaluated expression that is wrapped by the brackets`this is how you escape \{ and \} and \``
– Backward slash (\
) is used to escape characters like {
, }
that are used for keypaths.isUuid(uuid: string): bool
Returns true if uuid is valid.
randomUuid(): string
Returns a random UUIDv4.
Deprecated: use randomUuid
instead
uuid(): string
Returns a random UUIDv4. See randomUuid for more details.
firstNonNull(value: any, ...values: any): any
Returns the first non-null value from the parameters. Works only on scalars for now.
if(condition: bool, then: any, else: any?): any
return either the then
or else
according to the result of condition
in(comparand: any, value: any, ...values: any): bool
Tests if the comparand
is equal to any of the values in a set v1 ... vN
.
recordLocation(): string
Returns the location of the record (e.g.: s3 URL)
abs(number: number): number
Returns the absolute value of number
ceil(number: number): number
Rounds the value up to the nearest integer
e(): number
Returns the constant Euler’s number.
floor(number: number): number
Rounds the value down to the nearest integer
fromBase(string: string, radix: number): number
Returns the value of string
interpreted as a base-radix number.
ln(number: number): number
Returns the natural log of number
log(base: number, number: number): number
Returns the log of number
in base base
log2(number: number): number
Returns the log of number
in base 2. Equivalent to log(2, number)
max(value: number, ...values: number): number
Returns the largest number of all the numbers passed to the function
min(value: number, ...values: number): number
Returns the smallest number of all the numbers passed to the function
mod(number: number, divisor: number): number
Returns the modulus (remainder) of number
divided by divisor
.
pi(): number
Returns the constant Pi.
power(number: number, exponent: number): number
Returns number^exponent
random(): number
Returns a pseudo-random value in the range 0.0 <= x < 1.0
.
randomInt(upperBound: number): number
Returns a pseudo-random integer number between 0 and n (exclusive)
round(number: number, digits: number?): number
Round number
to digits
decimal places
sqrt(number: number): number
Returns square root of a number.
toBase(number: number, radix: number): string
Returns the base-radix representation of number
.
urlDecode(string: string): string
Unescapes the URL encoded in string.
urlEncode(string: string): string
Escapes string by encoding it so that it can be safely included in URL.
Functions for processing timestamps, intervals and other time-related constructs.
Many date/time functions accept a time unit argument to tweak their behaviour. Dataprime supports time units from nanoseconds to days. They are represented as literal strings of the time unit name in either long or short notation:
'day'
, 'hour'
, 'minute'
, 'second'
, 'milli'
, 'micro'
, 'nano'
'd'
, 'h'
, 'm'
, 's'
, 'ms'
, 'us'
, 'ns'
Dataprime timestamps are always stored in the UTC time zone, but some date/time functions accept a time zone argument to tweak their behaviour. Time zone arguments are strings that specify a time zone offset, shorthand or identifier:
'+01'
or '-02'
)'+0130'
or '-0230'
)'+01:30'
or '-02:30'
)'UTC'
, 'GMT'
, 'EST'
, etc.)'Asia/Yerevan'
, 'Europe/Zurich'
, 'America/Winnipeg'
, etc.)addInterval(left: interval, right: interval): interval
Adds two intervals together. Works also with negative intervals. Equivalent to left + right
.
addTime(t: timestamp, i: interval): timestamp
Adds an interval to a timestamp. Works also with negative intervals. Equivalent to t + i
.
diffTime(to: timestamp, from: timestamp): interval
Calculates the duration between two timestamps. Positive if to > from
, negative if to < from
. Equivalent to to - from
.
extractTime(timestamp: timestamp, unit: dateunit | timeunit, tz: string?): number
Extracts either a date or time unit from a timestamp
. Returns a floating point number for time units smaller than a 'minute'
, otherwise an integer. Date units such as 'month'
or 'week'
start from 1 (not from 0).
Function parameters:
timestamp
(required) – the timestamp to extract from.unit
(required) – the date or time unit to extract. Must be a string literal and one of:
'year'
, 'month'
, 'week'
, 'day_of_year'
, 'day_of_week'
'Y'
, 'M'
, 'W'
, 'doy'
, 'dow'
tz
(optional) – a time zone to convert the timestamp before extracting.Example 1: extract the hour in Tokyo
limit 1 | choose $m.timestamp.extractTime('h', 'Asia/Tokyo') as h # Result 1: 11pm { "h": 23 }
Example 2: extract the number of seconds
limit 1 | choose $m.timestamp.extractTime('second') as s # Result 2: 38.35 seconds { "s": 38.3510265 }
Example 3: extract the timestamp’s month
limit 1 | choose $m.timestamp.extractTime('month') as m # Result 3: August { "m": 8 }
Example 4: extract the day of the week
limit 1 | choose $m.timestamp.extractTime('dow') as d # Result 4: Tuesday { "d": 2 }
formatInterval(interval: interval, scale: timeunit?): string
Formats interval
to a string with an optional time unit scale
.
Function parameters:
interval
(required) – the interval to format.scale
(optional) – the largest time unit of the interval to show. Defaults to nano
.Example:
limit 3 | choose formatInterval(now() - $m.timestamp, 's') as i # Results: { "i": "122s261ms466us27ns" } { "i": "122s359ms197us227ns" } { "i": "122s359ms197us227ns" }
formatTimestamp(timestamp: timestamp, format: string?, tz: string?): string
Formats a timestamp
to a string with an optional format specification and destination time zone.
Function parameters:
timestamp
(required) – the timestamp to format.format
(optional) – a date/time format specification for parsing timestamps. Defaults to 'iso8601'
. The format can be any string with embedded date/time formatters, or one of several shorthands. Here are a few samples:
'%Y-%m-%d'
– print the date only, e.g. '2023-04-05'
'%H:%M:%S'
– print the time only, e.g. '16:07:33'
'%F %H:%M:%S'
– print both date and time, e.g. '2023-04-05 16:07:33'
'iso8601'
– print a timestamp in ISO 8601 format, e.g. '2023-04-05T16:07:33.123Z'
'timestamp_milli'
– print a timestamp in milliseconds (13 digits), e.g. '1680710853123'
tz
(optional) – the destination time zone to convert the timestamp before formatting.Example 1: print a timestamp with default format and +5h offset
limit 1 | choose $m.timestamp.formatTimestamp(tz='+05') as ts # Result 1: { "ts": "2023-08-29T19:08:37.405937400+0500" }
Example 2: print only the year and month
limit 1 | choose $m.timestamp.formatTimestamp('%Y-%m') as ym # Result 2: { "ym": "2023-08" }
Example 3: print only the hours and minutes
limit 1 | choose $m.timestamp.formatTimestamp('%H:%M') as hm # Result 3: { "hm": "14:11" }
Example 4: print a timestamp in milliseconds (13 digits)
limit 1 | choose $m.timestamp.formatTimestamp('timestamp_milli') as ms # Result 4: { "ms": "1693318678696" }
fromUnixTime(unixTime: number, timeUnit: timeunit?): timestamp
Converts a number of a specific time units since the UNIX epoch to a timestamp (in UTC). The UNIX epoch starts on January 1, 1970 – earlier timestamps are represented by negative numbers.
Function parameters:
unixTime
(required) – the amount of time units to convert. Can be either positive or negative and will be rounded down to an integer.timeUnit
(optional) – the time units to convert. Defaults to 'milli'
.Example:
limit 1 | choose fromUnixTime(1658958157515, 'ms') as ts # Result: { "ts": 1658958157515000000 }
multiplyInterval(i: interval, factor: number): interval
Multiplies an interval by a numeric factor
. Works both with integer and fractional numbers. Equivalent to i * factor
now(): timestamp
Returns the current time at query execution time. Stable across all rows and within the entire query, even when used multiple times. Nanosecond resolution if the runtime supports it, otherwise millisecond resolution.
Example:
limit 3 | choose now() as now, now() - $m.timestamp as since # Results: { "now": 1693312549105874700, "since": "14m954ms329us764ns" } { "now": 1693312549105874700, "since": "14m954ms329us764ns" } { "now": 1693312549105874700, "since": "14m960ms519us564ns" }
parseInterval(string: string): interval
Parses an interval from a string
with format NdNhNmNsNmsNusNns
where N
is the amount of each time unit. Returns null
when the input does not match the expected format:
'd'
, 'h'
, 'm'
, 's'
, 'ms'
, 'us'
, 'ns'
.-
to represent negative intervals.Example 1: parse a zero interval
limit 1 | choose '0s'.parseInterval() as i # Result 1: { "i": "0ns" }
Example 2: parse a positive interval
limit 1 | choose '1d48h0m'.parseInterval() as i # Result 2: { "i": "3d" }
Example 3: parse a negative interval
limit 1 | choose '-5m45s'.parseInterval() as i # Result 3: { "i": "-5m45s" }
parseTimestamp(string: string, format: string?, tz: string?): timestamp
Parses a timestamp from string
with an optional format specification and time zone override. Returns null
when the input does not match the expected format.
Function parameters:
string
(required) – the input from which the timestamp will be extracted.format
(optional) – a date/time format specification for parsing timestamps. Defaults to 'auto'
. The format can be any string with embedded date/time extractors, one of several shorthands, or a cascade of formats to be attempted in sequence. Here are a few samples:
'%Y-%m-%d'
– parse date only, e.g. '2023-04-05'
'%F %H:%M:%S'
– parse date and time, e.g. '2023-04-05 16:07:33'
'iso8601'
– parse a timestamp in ISO 8601 format, e.g. '2023-04-05T16:07:33.123Z'
'timestamp_milli'
– parse a timestamp in milliseconds (13 digits), e.g. '1680710853123'
'%m/%d/%Y|timestamp_second'
– parse either a date or a timestamp in seconds, in that ordertz
(optional) – a time zone override to convert the timestamp while parsing. This parameter will override any time zone present in the input. A time zone can be extracted from the string by using an appropriate format and omitting this parameter.Example 1: parse a date with the default format
limit 1 | choose '2023-04-05'.parseTimestamp() as ts # Result 1: { "ts": 1680652800000000000 }
Example 2: parse a date in US format
limit 1 | choose '04/05/23'.parseTimestamp('%D') as ts # Result 2: { "ts": 1680652800000000000 }
Example 3: parse date and time with units
limit 1 | choose '2023-04-05 16h07m'.parseTimestamp('%F %Hh%Mm') as ts # Result 3: { "ts": 1680710820000000000 }
Example 4: parse a timestamp in seconds (10 digits)
limit 1 | choose '1680710853'.parseTimestamp('timestamp_second') as ts # Result 4: { "ts": 1680710853000000000 }
Deprecated: use parseTimestamp
instead
parseToTimestamp(string: string, format: string?, tz: string?): timestamp
Parses a timestamp from string
with an optional format specification and time zone override. See parseTimestamp for more details.
roundInterval(interval: interval, scale: timeunit): interval
Rounds an interval to a time unit scale
. Smaller time units will be zeroed out.
Function parameters:
interval
(required) – the interval to round.scale
(required) – the largest time unit of the interval to keep.Example:
limit 1 | choose 2h5m45s.roundInterval('m') as i # Result: { "i": "2h5m" }
roundTime(date: timestamp, interval: interval): timestamp
Rounds a timestamp to the given interval. Useful for bucketing, e.g. rounding to 1h
for hourly buckets. Equivalent to date / interval
.
Example:
groupby $m.timestamp.roundTime(1h) as bucket count() as n # Results: { "bucket": "29/08/2023 15:00:00.000 pm", "n": 40653715 } { "bucket": "29/08/2023 14:00:00.000 pm", "n": 1779386 }
subtractInterval(left: interval, right: interval): interval
Subtracts one interval from another. Equivalent to addInterval(left, -right)
and left - right
.
subtractTime(t: timestamp, i: interval): timestamp
Subtracts an interval from a timestamp. Equivalent to addTime(t, -i)
and t - i
.
Deprecated: use roundTime
instead
timeRound(date: timestamp, interval: interval): timestamp
Rounds a timestamp to the given interval. See roundTime for more details.
toInterval(number: number, timeUnit: timeunit?): interval
Converts a number
of specific time units to an interval. Works with both integer / floating point and positive / negative numbers.
Function parameters:
number
(required) – the amount of time units to convert.timeUnit
(optional) – Time units to convert. Defaults to nano
.Example 1: convert a floating point number
limit 1 | choose 2.5.toInterval('h') as i # Result 1: { "i": "2h30m" } # Example 2: convert an integer number limit 1 | choose -9000.toInterval() as i # Result 2: { "i": "-9us" }
Deprecated
toIso8601DateTime(timestamp: timestamp): string
Alias to formatTimestamp(timestamp, 'iso8601')
.
Formats timestamp
to an ISO 8601 string with nanosecond output precision.
Example:
limit 1 | choose $m.timestamp.toIso8601DateTime() as ts # Result: { "ts": "2023-08-11T07:29:17.634Z" }
toUnixTime(timestamp: timestamp, timeUnit: timeunit?): number
Converts timestamp
to a number of specific time units since the UNIX epoch (in UTC). The UNIX epoch starts on January 1, 1970 – earlier timestamps are represented by negative numbers.
Function parameters:
timestamp
(required) – the timestamp to convert.timeUnit
(optional) – the time units to convert to. Defaults to 'milli'
.Example:
limit 1 | choose $m.timestamp.toUnixTime('hour') as hr # Result: { "hr": 470363 }
decodeBase64(value: string): string
Decode a base-64 encoded string
encodeBase64(value: string): string
Encode a string into base-64
Case expressions are special constructs in the language that allow choosing between multiple options in an easy manner and in a readable way. They can be wherever an expression is expected.
Choose between multiple values based on several generic conditions. Resort to a default-value
if no condition is met.
case { condition1 -> value1, condition2 -> value2, ... conditionN -> valueN, _ -> <default-value> }
Example:
case { $d.status_code == 200 -> 'success', $d.status_code == 201 -> 'created', $d.status_code == 404 -> 'not-found', _ -> 'other' } # Here's the same example inside the context of a query. A new field is created with the `case` result, # and then a filter will be applied, leaving only non-successful responses. source logs | ... | create $d.http_response_outcome from case { $d.status_code == 200 -> 'success', $d.status_code == 201 -> 'created', $d.status_code == 404 -> 'not-found', _ -> 'other' } | filter $d.http_response_outcome != 'success'
A shorthand for case
which allowing checking if a string s
contains one of several substrings without repeating the expression leading to s
. The chosen value is the first which matches s.contains(substring)
.
case_contains { s: string, substring1 -> result1, substring2 -> result2, ... substring3 -> resultN }
Example:
case_contains { $l.subsystemname, '-prod-' -> 'production', '-dev-' -> 'development', '-stg-' -> 'staging', _ -> 'test' }
A shorthand for case
which allowing comparing some expression e
to several results without repeating the expression. The chosen value is the first which matches s == value
case_equals { e: any, value1 -> result1, value2 -> result2, ... valueN -> resultN }
Example:
case_equals { $m.severity, 'info' -> true, 'warning -> true, _ -> false }
A shorthand for case
which allows comparing n
to multiple values without repeating the expression leading to n
. The chosen value is the first which matches expression > value
.
case_greaterthan { n: number, value1: number -> result1, value2: number -> result2, ... valueN: number -> resultN, _ -> <default-value> }
Example:
case_greaterthan { $d.status_code, 500 -> 'server-error', 400 -> 'client-error', 300 -> 'redirection', 200 -> 'success', 100 -> 'information', _ -> 'other' }
A shorthand for case
which allows comparing a number n
to multiple values without repeating the expression leading to n
. The chosen value is the first which matches expression < value
.
case_lessthan { n: number, value1: number -> result1, value2: number -> result2, ... valueN: number -> resultN, _ -> <default-value> }
Example:
case_lessthan { $d.temperature_celsius, 10 -> 'freezing', 20 -> 'cold', 30 -> 'fun', 45 -> 'hot', _ -> 'burning' }
Returns any non-null expression value in the group. If expression is not defined, it defaults to the $data
object.
any_value(expression: any?)
Returns null if all expression values in the group are null.
Example:
groupby $m.severity calculate any_value($d.url)
Calculates the average value of a numerical expression in the group.
avg(expression: number)
Example:
groupby $m.severity calculate avg($d.duration) as average_duration
Counts non-null expression values. If expression is not defined, all rows will be counted.
count(expression: any?) [into <keypath>]
An alias can be provided to override the keypath the result will be written to.
For example, the following part of a query
count() into $d.num_rows
will result in a single row of the following form:
{ "num_rows": 7532 }
Counts non-null expression values on rows which satisfy condition. If expression is not defined, all rows that satisfy condition will be counted.
count_if(condition: bool, expression: any?)
Example:
groupby $m.severity calculate count_if($d.duration > 500) as $d.high_duration_logs groupby $m.severity calculate count_if($d.duration > 500, $d.company_id) as $d.high_duration_logs
Counts non-null distinct expression values.
distinct_count(expression: any)
Example:
groupby $l.applicationname calculate distinct_count($d.username) as active_users
Counts non-null distinct expression values on rows which satisfy condition.
distinct_count_if(condition: bool, expression: any)
Example:
groupby $l.applicationname calculate distinct_count_if($m.severity == 'Error', $d.username) as users_with_errors
Calculates the maximum value of a numerical expression in the group.
max(expression: number | timestamp)
Example:
groupby $m.severity calculate max($d.duration)
Calculates the minimum value of a numerical expression in the group.
min(expression: number | timestamp)
Example:
groupby $m.severity calculate min($d.duration)
Calculates the approximate n-th percentile value of a numerical expression in the group.
percentile(percentile: number, expression: number, error_threshold: number?)
Since the percentile calculation is approximate, the accuracy may be controlled with the error_threshold
parameter which ranges from 0
to 1
(defaults to 0.01
). A lower value will result in better accuracy at the cost of longer query times.
Example:
groupby $m.severity calculate percentile(0.99, $d.duration) as p99_latency
Computes the sample standard deviation of a numerical expression in the group.
sample_stddev(expression: number)
Example:
groupby $m.severity calculate sample_stddev($d.duration)
Computes the variance of a numerical expression in the group.
sample_variance(expression: number)
Example:
groupby $m.severity calculate sample_variance($d.duration)
Computes the standard deviation of a numerical expression in the group.
stddev(expression: number)
Example:
groupby $m.severity calculate stddev($d.duration)
Calculates the sum of a numerical expression in the group.
sum(expression: number)
Example:
groupby $m.severity calculate sum($d.duration) as total_duration
Computes the variance of a numerical expression in the group.
variance(expression: number)
Example:
groupby $m.severity calculate variance($d.duration)
When querying with the groupby operator, you can now apply an aggregation function (such asavg, max, sum) to the bucket of results. This feature gives you the power to manipulate an aggregation expression inside the expression itself, allowing you to calculate and manipulate your data simultaneously.
Example 1
This examples takes logs which have some connect_duration
and batch_duration
fields, and calculates the ratio between the averages of those durations, per region
.
# Query source logs | groupby region aggregate avg(connect_duration) / avg(batch_duration)
Example 2
This query calculates the percentage of logs which don’t have a kubernetes_pod_name
out of the total number of logs. The calculation is done per subsystem.
# Query source logs | groupby $l.subsystemname aggregate sum(if(kubernetes.pod_name != null,1,0)) / count() as pct_without_pod_name
Example 3
This query calculates the ratio between the maximum and minimum salary per department, and provides a Based on N Employees
string as an additional column per row.
# Query source logs | groupby department_id aggregate max(salary) / min(salary) as salary_ratio `Based on {count()} Employees`)
Example 4
This query calculates the ratio between error logs and info logs.
source logs | groupby $m.timestamp / 1h as hour aggregate count_if($m.severity == '5') / count_if($m.severity == '3') as error_to_info_ratio
Need help?
Our world-class customer success team is available 24/7 to answer any questions that may come up.
Feel free to reach out to us via our in-app chat or by sending us an email at [email protected].