DataPrime is Coralogix’s next-generation query language. It’s a piped language that provides users with a simple yet powerful way to describe event transformations and aggregations. The balance between simplicity and power is achieved by having a rather small set of idioms that encapsulate event structure transformation while supporting the use of standard JavaScript expressions to describe value transformations.
DataPrime is currently enabled for the ‘Explore’ view of your logs, in ‘Archive’ mode. (Please note: to query your archive with DataPrime, make sure to enable the CX-Data format bucket.)
Find a list of namespaces, example expressions, operator syntax, and more in our DataPrime Quick-Start Guide.
[NEW] DataPrime now supports Data Aggregation. For more information and examples, please refer to the DataPrime Cheat Sheet.
Both DataPrime and Lucene are available for querying your Archive and Logs (under “Explore”). Click the currently active language label to toggle between the two languages: clicking <>Lucene switches to <>DataPrime, and vice versa.
While in DataPrime mode, two additional buttons are enabled:
– Cheat sheet: A detailed sheet that includes all the schemes and language basics with examples
– Query History: For reusing your historical DataPrime queries
A query is composed of multiple stages (e.g. do X, then do Y, and then…). The syntax is based on bash-like pipes, where each stage’s output is piped into the next one.
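The stage-by-stage flow described above can be illustrated outside DataPrime. The following is a minimal Python sketch, not DataPrime syntax: the helper `run_pipeline`, the two stages, and the sample events are all hypothetical names and data chosen for illustration.

```python
from functools import reduce
from typing import Callable, Iterable, List

Event = dict
Stage = Callable[[Iterable[Event]], Iterable[Event]]


def run_pipeline(events: Iterable[Event], stages: List[Stage]) -> Iterable[Event]:
    """Feed the output of each stage into the next one, left to right."""
    return reduce(lambda stream, stage: stage(stream), stages, events)


# Hypothetical stage: keep only events whose severity is "error".
def keep_errors(events: Iterable[Event]) -> Iterable[Event]:
    return (e for e in events if e.get("severity") == "error")


# Hypothetical stage: reduce each event to its "msg" key.
def keep_message_only(events: Iterable[Event]) -> Iterable[Event]:
    return ({"msg": e["msg"]} for e in events)


# Made-up sample events.
events = [
    {"severity": "info", "msg": "started"},
    {"severity": "error", "msg": "disk full"},
]

result = list(run_pipeline(events, [keep_errors, keep_message_only]))
```

Each stage only sees what the previous stage emitted, which is the property the pipe syntax relies on.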
DataPrime can handle fully nested data. Nested keys are written as ‘keypaths’ (e.g. key.subkey.subkey) and are handled in a granular way, meaning that operations happen only on the relevant keys, leaving other nested keys intact.
For example, creating a new keypath stats.mykey will either create a new key called mykey in an existing stats superkey, or create the entire path: a top-level object called stats and, within it, a subkey called mykey.
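The keypath behaviour described above can be sketched in Python. This is an illustration of the idea only, not how DataPrime implements it: `set_keypath` is a hypothetical helper, and it assumes every existing parent along the path is itself an object (dict).

```python
def set_keypath(event: dict, keypath: str, value) -> dict:
    """Set `value` at a dotted keypath, creating missing parent objects."""
    *parents, leaf = keypath.split(".")
    node = event
    for key in parents:
        # Reuse an existing nested object, or create it if it is missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return event


# Case 1: the stats superkey already exists; its other keys stay intact.
existing = {"stats": {"count": 3}, "msg": "ok"}
set_keypath(existing, "stats.mykey", 1)

# Case 2: stats does not exist, so the entire path is created.
fresh = {"msg": "ok"}
set_keypath(fresh, "stats.mykey", 1)
```

In both cases only the keys on the path are touched; sibling keys such as `count` and `msg` are left as they were.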
The language contains a small set of idioms for structure transformation. A large part of its power comes from the ability to use JavaScript-like expressions in various places throughout the language. This allows for describing rich value transformations without resorting to special language-constructs, or to actual code.
Several predefined scopes/namespaces are available for expressions. The main ones are the following:
– $d / $data: The user data. For raw data, it’s the event data itself, but after aggregations, this could be the aggregation results
– $m / $metadata: Engine-related event metadata, such as the timestamp and the logid
– $l / $labels: User-managed event labels. Flat, key/values (strings only)
Refer to the my_text field in the input:
$d.my_text
Refer to the key named key inside the key stats:
$d.stats.key
The result of multiplying the value of the radius key by 8:
$d.radius * 8
The logical timestamp of the event:
$m.timestamp
The application name of the event:
$l.applicationName
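To make the scope prefixes concrete, here is a minimal Python sketch of how references like the ones above could be resolved against a single event. It is an illustration under stated assumptions, not the engine's actual representation: the `resolve` helper, the layout of the event as a dict keyed by scope name, and all sample values are hypothetical.

```python
# Short scope names map to their long forms, as listed above.
SCOPE_ALIASES = {"$d": "$data", "$m": "$metadata", "$l": "$labels"}


def resolve(event: dict, ref: str):
    """Look up a reference such as "$d.stats.key" in a scoped event."""
    scope, *path = ref.split(".")
    node = event[SCOPE_ALIASES.get(scope, scope)]
    for key in path:
        node = node[key]
    return node


# Made-up event; the three top-level keys mirror the three scopes.
event = {
    "$data": {"my_text": "hello", "stats": {"key": 7}, "radius": 2},
    "$metadata": {"timestamp": 1700000000, "logid": "abc"},
    "$labels": {"applicationName": "checkout"},
}

text = resolve(event, "$d.my_text")
nested = resolve(event, "$data.stats.key")   # long form of the same scope
scaled = resolve(event, "$d.radius") * 8     # value expression on a key
app = resolve(event, "$l.applicationName")
```

The short and long scope names resolve to the same data, so `$d.stats.key` and `$data.stats.key` are interchangeable in this sketch.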
Evaluated expressions have a dynamic data type, as in any JavaScript code. It’s the job of DataPrime to track these data types when they’re applied as values of keys.
Data extractions are natively supported by the language, and are extendable, meaning that multiple types of extractions are supported, and new ones can be added without changing the structure of the language.
Examples of extraction types:
Extracting captured data from a string into a new object:
regexp
Extracting key-value pairs from a string into a new object:
kv
Creating a new object from JSON encoded as a string:
jsonobject
Splitting a string into a new array of native elements:
split
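The four extraction types above each turn a string into structured data. The Python sketch below shows roughly what each one produces, using only the standard library. It is not DataPrime syntax: the function names, default separators, and sample strings are hypothetical, and `extract_split` returns plain strings rather than typed elements.

```python
import json
import re


def extract_regexp(text: str, pattern: str) -> dict:
    """Return the named capture groups of the first match as a new object."""
    match = re.search(pattern, text)
    return match.groupdict() if match else {}


def extract_kv(text: str, pair_sep: str = " ", kv_sep: str = "=") -> dict:
    """Parse "k=v k2=v2" style text into a new object."""
    pairs = (item.split(kv_sep, 1) for item in text.split(pair_sep) if kv_sep in item)
    return {key: value for key, value in pairs}


def extract_jsonobject(text: str):
    """Decode a JSON document that is stored as a string."""
    return json.loads(text)


def extract_split(text: str, delimiter: str = ",") -> list:
    """Split a string into a new array of elements."""
    return text.split(delimiter)


captured = extract_regexp(
    "user=alice took 35ms", r"user=(?P<user>\w+) took (?P<ms>\d+)ms"
)
pairs = extract_kv("country=us status=200")
obj = extract_jsonobject('{"a": {"b": 1}}')
parts = extract_split("a,b,c")
```

Because every extraction has the same shape (string in, new object or array out), further extraction types can be added alongside these without changing how they are invoked.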
A Store is the definition of some storage mechanism for data. This could be a Kafka topic or an S3 location, for example, and includes metadata about the content structure, schema, and primary key (used for enrichments).