In modern observability, Lucene is the most commonly used query language for log analysis, and it has earned that place. Still, as industry demands change and the challenge of observability grows more difficult, Lucene’s limitations become more obvious.
Lucene is excellent for key-value querying. For example, if I have a log with a field userId and I want to find all logs pertaining to the user Alex, then I can run a simple query: userId: Alex.
To understand Lucene’s limitations, ask a more advanced question: Who are the top 10 most active users on our site? Unfortunately, answering this is complex and requires functionality, such as grouping and aggregation, that is not found in Lucene. So something new is necessary at this point. More than just a query language, observability needs a syntax that will help us explore new insights within our data.
DataPrime is the Coralogix query syntax that allows users to explore their data, perform schema-on-read transformations, group and aggregate fields, extract data, and much more. Let’s look at a few examples.
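To answer a question like this, let’s break down our problem into stages: group the logs by user, count the logs in each group, sort the groups by that count, and keep the first 10.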
Most of these activities are completely impossible in Lucene, so let’s explore how they look in DataPrime:
DataPrime transforms this complex problem into a flattened series of processes, allowing users to think about their data as it transforms through their query rather than nesting and forming complex hierarchies of functionality.
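The flattened, stage-by-stage idea can be illustrated outside of DataPrime. The following is a minimal Python sketch, not DataPrime syntax, of the "top 10 most active users" question as a linear pipeline. The sample logs and the helper name top_active_users are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample logs; the field names are illustrative assumptions.
logs = [
    {"userId": "Alex", "path": "/home"},
    {"userId": "Sam", "path": "/cart"},
    {"userId": "Alex", "path": "/cart"},
    {"path": "/health"},  # no userId, so it is skipped below
    {"userId": "Alex", "path": "/checkout"},
    {"userId": "Sam", "path": "/home"},
    {"userId": "Kim", "path": "/home"},
]


def top_active_users(logs, n=10):
    """Return the n users with the most log entries, most active first."""
    # Stage 1: keep only logs that carry a userId.
    with_user = (log for log in logs if "userId" in log)
    # Stage 2: group by user and count the logs in each group.
    counts = Counter(log["userId"] for log in with_user)
    # Stages 3 and 4: sort by count (descending) and keep the first n.
    return counts.most_common(n)


top_users = top_active_users(logs)
print(top_users)  # [('Alex', 3), ('Sam', 2), ('Kim', 1)]
```

Each stage consumes the output of the previous one, which is the same mental model the flattened query style encourages: no nesting, just data flowing through a sequence of steps.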
Extracting data in DataPrime is straightforward with the extract command. This command allows users to transform unstructured data into parsed objects that are included as part of the schema (a capability known as schema on read). Extract supports a number of methods:
The following example shows how simple it is to use regular expressions to capture multiple values from unstructured data.
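To make the idea of regex-based extraction concrete, here is a rough Python sketch of capturing several values from one unstructured message and adding them to the record as new fields. It is not DataPrime syntax. The log format, the field names, and the helper extract_fields are assumptions made up for this illustration.

```python
import re

# Assumed log format for illustration: "user=<name> action=<verb> duration=<n>ms".
LINE_PATTERN = re.compile(
    r"user=(?P<user>\w+)\s+action=(?P<action>\w+)\s+duration=(?P<duration_ms>\d+)ms"
)


def extract_fields(log):
    """Return a copy of the log with fields parsed out of its 'msg' text.

    The stored record is not modified, mirroring the schema-on-read idea:
    new fields exist only in the query result.
    """
    match = LINE_PATTERN.search(log["msg"])
    if match is None:
        return dict(log)  # nothing to extract; return the log unchanged
    # Named groups become new top-level fields next to the original ones.
    return {**log, **match.groupdict()}


raw = {"msg": "2024-01-01 INFO user=Alex action=login duration=120ms"}
parsed = extract_fields(raw)
print(parsed["user"], parsed["action"], parsed["duration_ms"])  # Alex login 120
```

Note that the captured values are strings; a real pipeline would usually convert numeric fields such as the duration before aggregating on them.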
Logs often contain personal information. A common solution to this problem is to extract the data, redact it in another tool and send the redacted version. All this does is copy personal data and increase the attack surface. Instead, use DataPrime to redact data as it’s queried.
This keeps unredacted personal data from being copied out of the system, and helps companies analyze their data while maintaining data integrity and confidentiality.
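The principle of redacting at query time can be sketched in a few lines of Python. This is only an illustration of the approach, not DataPrime syntax: the email pattern is deliberately simple, and the names redact, query, and stored are hypothetical.

```python
import re

# A deliberately simple email pattern, assumed sufficient for this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(value, mask="<redacted>"):
    """Replace every email address in a string with the mask."""
    return EMAIL.sub(mask, value)


def query(logs, predicate):
    """Yield matching logs with string fields redacted on the way out.

    The stored logs are never modified or copied to another tool;
    redaction is applied only to the results handed back to the caller.
    """
    for log in logs:
        if predicate(log):
            yield {
                key: redact(val) if isinstance(val, str) else val
                for key, val in log.items()
            }


stored = [
    {"level": "error", "msg": "payment failed for alex@example.com"},
    {"level": "info", "msg": "health check ok"},
]
results = list(query(stored, lambda log: log["level"] == "error"))
print(results)  # [{'level': 'error', 'msg': 'payment failed for <redacted>'}]
```

Because the mask is applied as results are produced, there is no second, separately stored copy of the personal data to secure.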
With access to a much more sophisticated set of tools, users can explore and analyze their data like never before. Don’t settle for simple queries and complex syntax. Flatten your processing, and generate entirely new fields on the fly using DataPrime.