DataPrime fair usage limits
The DataPrime query language is varied and powerful, but there are some key fair usage limits that must be understood. Many of these limits are so high that you will never run into them, but they should still guide your thinking when crafting DataPrime queries.
Scan limit exceeded
Your scan limit is the amount of data you are permitted to scan as part of a query. Scanning is an essential part of any query, because it is the mechanism by which individual documents are searched to test whether they match some condition.
For Frequent Search data, your maximum scan size is 100MB.
For all other kinds of data, the limit is much higher, and will scale with your account as you ingest more data.
Troubleshooting scan limit exceeded
Most often, the best remedy for a scan limit exceeded error is to simply switch from querying Frequent Search data to Monitoring or Compliance data by selecting All Logs or All Traces in their respective UIs.
Additionally, the DataPrime query engine will attempt to minimize the amount of data scanned through smart optimization, but some types of queries cannot be optimized sufficiently and will scan a lot of data regardless.
For example, consider the following query:
extract msg into action_data using regexp(e=/(?<user>.*) did (?<action>.*) action/)
| filter user == 'Chris'
This approach creates a new field for the purposes of filtering, where it would have been easier and less data-intensive to simply search the msg field directly:
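(A minimal sketch of the more direct approach, assuming a substring check such as contains is available on string fields; the exact function may differ in your environment.)

filter msg.contains('Chris')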
While these operations aren't entirely equivalent, these types of optimizations can give the DataPrime query engine more freedom to reduce query scan volume.
NOTE: This example is illustrative. Optimizations are heavily dependent on your data structure.
Expression limits
Each expression is subject to a set of parsing limits. These limits are designed to be invisible to handwritten queries, and are only in place to prevent extremely complex autogenerated queries from being executed.
Maximum depth
Depth in DataPrime is determined by the number of levels a parsed expression is resolved to. Consider the following query:
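(The query below is a hypothetical sketch; the field name is illustrative.)

filter status_code == 500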
When DataPrime parses this query, it is initially optimized and transformed into an Abstract Syntax Tree (AST). Viewing depth is much easier when the query is viewed in this form:
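(The tree below is an illustrative sketch; the engine's actual AST may differ in shape.)

filter
└── ==
    ├── status_code
    └── 500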
In this format, it's easy to see that there are 3 tiers in the AST. This means that the query has a depth of 3.
Maximum length
Length in DataPrime is defined by the number of nodes in a given query. For example, the following query has 4 nodes: one node per value, plus one extra for the root node of the AST:
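(The query below is a hypothetical sketch; the field names are illustrative.)

choose region, pod, container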
Again, this is easier to visualize as an AST:
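(Again, an illustrative sketch rather than the engine's exact representation.)

choose
├── region
├── pod
└── container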
DataPrime supports a maximum length (or AST Node Count) of 50,000. In real terms, this means you will likely never need to worry about this.
Managing length
If you are running into the 50k limit on length, there are some things you can do:
- Consider breaking up your processing into multiple queries where possible
- Consider replacing functionality you've written by hand with an inbuilt function that requires fewer nodes in the AST to express, as sketched below.
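For example, a chain of hand-written conditions can sometimes be collapsed into a single built-in call. The sketch below assumes a contains method and a regex-matching helper such as matches are available on string fields; the exact function names may differ in your environment:

filter msg.contains('error') || msg.contains('failure') || msg.contains('timeout')

could become

filter msg.matches(/error|failure|timeout/)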
High tier tokenization
In the high tier, Coralogix saves text fields longer than 256 characters only in tokenized form, without special characters and stop words.
Most functions in DataPrime do not work with the tokenized form, which means they will ultimately return null. One notable exception is textSearch, which operates on tokens by default.
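(A sketch of the exception; the call form shown here is an assumption, so check the DataPrime function reference for the exact signature of textSearch.)

filter msg.textSearch('connection timeout')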
Key actions and best practices
Very long string values should be broken up using Coralogix Parsing Rules, to ensure they don't exceed 256 characters. Not only will this minimize incompatibility with DataPrime functions, but it will also aid in query performance and make your data much more readable.
No keypath adjustments
DataPrime does not have keypath adjustments. If a keypath contains dots, you are required to use the bracket access syntax to refer to that keypath in archive mode. For example, consider the following document:
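(A hypothetical document; the field name is illustrative.)

{
  "kubernetes.pod.name": "checkout-7d9f"
}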
The field name itself contains dot (.) characters. This means that the following query would not work as expected:
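(A sketch of the problematic query; each dot would be interpreted as a step into a nested object rather than as part of the field name.)

filter kubernetes.pod.name == 'checkout-7d9f'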
Instead, you should use the [] syntax for accessing a field:
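(A sketch of the same filter using bracket access; the $d prefix and quoting shown here follow common DataPrime usage but should be verified against the keypath documentation for your account.)

filter $d['kubernetes.pod.name'] == 'checkout-7d9f'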
This is also true for any other fields containing special characters, such as fields containing whitespace:
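(Another hypothetical field name, using the same bracket-access sketch as above.)

filter $d['response time'] > 500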