How to use DataPrime to enrich and reshape data on the fly
Goal
By the end of this guide you should be able to enrich documents using lookup tables, extract structured values from strings, parse key-value pairs, and explode arrays into separate documents.
Why it matters
Data in logs and traces is often messy, inconsistent, or incomplete. You may need to add metadata, normalize fields across schemas, parse semi-structured text, or reshape arrays into flat rows for easier analysis. DataPrime lets you do all of this at query time, without needing to preprocess or re-index.
These transformations are essential for debugging, auditing, and building clean, meaningful dashboards—even when your logs aren’t clean.
Enrich documents with lookup metadata (enrich)
The enrich command allows you to add contextual data from an external table (such as team assignments, department names, or locations) based on a key in your document. This is well suited to enriching logs with human-readable or operational metadata that isn't present in the original log stream. A custom enrichment table is required to use the enrich command.
For example, if your log contains a userid, you can enrich it with fields like name and department from the lookup table. The enriched data is attached as a nested object under user_data.
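The lookup behavior described above can be sketched in Python. This is only an analogue of the semantics, not DataPrime syntax: the USERS table, its rows, and the enrich() helper are hypothetical, and leaving the document unchanged when no row matches is an assumption of this sketch.

```python
# Hypothetical lookup table keyed by user ID; it stands in for a custom enrichment table.
USERS = {
    "111": {"name": "John", "department": "Finance"},
    "222": {"name": "Emily", "department": "IT"},
}


def enrich(doc, key, into, table):
    """Return a copy of doc with the matching table row nested under `into`."""
    lookup = str(doc.get(key))  # cast first: the lookup key must be a string
    row = table.get(lookup)
    out = dict(doc)
    if row is not None:
        out[into] = dict(row)  # attach the row as a nested object
    return out  # assumption: documents with no matching row are returned unchanged


log = {"userid": 111, "action": "login"}  # numeric ID, cast to "111" for the lookup
enriched = enrich(log, key="userid", into="user_data", table=USERS)
unmatched = enrich({"userid": "999"}, key="userid", into="user_data", table=USERS)
```

Here enriched carries name and department under user_data next to the original fields, while unmatched has no user_data key at all.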
Extract structured data from a string (extract + regexp)
Logs often contain structured information hidden inside a string message. Use the extract command with a regular expression to pull out useful fields like usernames, error codes, or transaction IDs. This makes them accessible for filtering, grouping, and display.
A named capture group such as (?<username>...) ensures that, if the pattern matches, the username is extracted into a new field under the destination key, here parsed_fields.
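As a rough Python analogue of named-group extraction (not DataPrime syntax): the message format, the pattern, and the extract_regexp() helper below are hypothetical. Note that Python's re module spells a named group (?P<username>...) rather than (?<username>...). A non-matching message yields None, mirroring the null result noted under the pitfalls.

```python
import re

# Hypothetical pattern; Python requires the (?P<name>...) spelling for named groups.
PATTERN = re.compile(r"user (?P<username>\w+) logged in")


def extract_regexp(doc, source, into, pattern):
    """Return a copy of doc with the named groups of the first match nested under `into`."""
    match = pattern.search(str(doc.get(source, "")))
    out = dict(doc)
    # One sub-field per named group, or None when the pattern does not match.
    out[into] = match.groupdict() if match else None
    return out


hit = extract_regexp(
    {"message": "user alice logged in from 10.0.0.1"}, "message", "parsed_fields", PATTERN
)
miss = extract_regexp(
    {"message": "disk usage at 91%"}, "message", "parsed_fields", PATTERN
)
```

Downstream logic should check for the None case in miss before reading parsed_fields.username.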
Parse key-value strings into objects (extract + kv)
If your data includes log lines, query strings, or parameters encoded as key-value pairs, the kv extraction strategy is a fast way to parse them into structured fields. It works well for payloads formatted like a=b&c=d.
Extracting into a destination key such as query_params creates an object there, allowing you to reference values like query_params.user or query_params.env in downstream filters or transforms. You can also pair this with urlDecode() to clean encoded values.
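The key-value parsing step can be approximated in Python as follows. This is an illustrative sketch rather than DataPrime's implementation: extract_kv(), its delimiter parameters, and the sample query string are hypothetical, urllib.parse.unquote stands in for URL decoding, and silently skipping malformed pairs is an assumption of this sketch.

```python
from urllib.parse import unquote


def extract_kv(doc, source, into, pair_delimiter="&", key_delimiter="="):
    """Return a copy of doc with the key-value pairs of doc[source] parsed under `into`."""
    parsed = {}
    for pair in str(doc.get(source, "")).split(pair_delimiter):
        key, sep, value = pair.partition(key_delimiter)
        if not sep or not key:
            continue  # assumption: pairs without a key delimiter are skipped
        parsed[key] = unquote(value)  # decode percent-escapes such as %40
    out = dict(doc)
    out[into] = parsed
    return out


doc = extract_kv(
    {"query": "user=alice%40example.com&env=prod&broken"}, "query", "query_params"
)
```

The trailing "broken" fragment has no "=" and is dropped, which illustrates why inconsistent formatting is worth checking before relying on the parsed object.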
Explode arrays into multiple documents (explode)
When a log contains an array (like user roles, IP addresses, or scope tags), you can use explode to split it into separate documents, one per element. This makes the data much easier to analyze, aggregate, or filter.
With original preserve, all other fields from the original log are kept. Each resulting document contains one value from the array assigned to the key scope, which you can then group or filter on independently.
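A minimal Python sketch of the one-document-per-element idea follows. It is an analogue, not DataPrime syntax: the explode() helper, the preserve_original flag, and the sample log are hypothetical, and keeping the source array itself on each output document is an assumption of this sketch.

```python
def explode(doc, source, into, preserve_original=True):
    """Return one document per element of the array at doc[source]."""
    results = []
    for element in doc.get(source) or []:
        # Keep the original fields when preserving; otherwise start from an empty document.
        out = dict(doc) if preserve_original else {}
        out[into] = element  # overwrites any existing field with the same name
        results.append(out)
    return results


log = {"user": "alice", "scopes": ["read", "write", "admin"]}
rows = explode(log, "scopes", "scope")
collided = explode(log, "scopes", "scopes")  # destination name collides with the source
```

Each entry in rows keeps user and gains a single scope value, while collided shows the collision pitfall: the scopes array is replaced by one element per document.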
Expected output
After applying these transformations, your documents will be cleaner and more consistent. You’ll be able to:
- Join metadata into your logs based on user IDs or other keys.
- Pull meaningful fields out of messages for easier filtering and alerting.
- Convert encoded or blob-style strings into structured JSON.
- Flatten arrays into single-value documents for counting, grouping, and dashboards.
Common pitfalls or gotchas
- enrich only works if your lookup key is a string; cast it if needed.
- extract using regexp will return null if the pattern doesn't match.
- kv extraction assumes consistent formatting; watch for missing delimiters or malformed strings.
- explode overwrites destination fields if names collide; rename carefully.