How to use DataPrime to isolate and shape logs for deeper analysis
Goal
By the end of this guide, you should be able to use filter, block, choose, and create to isolate relevant log data, transform it, and prepare it for further analysis.
Why it matters
When debugging issues or investigating anomalies, you’ll rarely get what you need from a single filter. Real investigations require peeling back layers: filtering what matters, cutting what doesn’t, shaping the remaining data, and adding context for further questions. This guide shows you how to combine multiple DataPrime commands into a focused, intermediate-level workflow.
Filter logs based on key conditions
Use filter to include only logs that match your criteria. This is your first pass: tightening the lens to look only at relevant documents.
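For instance, a single-condition filter might look like the following. This is a minimal sketch, reusing the ip_address field and the ipInSubnet function from the full workflow at the end of this guide:

filter ipInSubnet(ip_address, '10.8.0.0/16')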
This example keeps only logs where the ip_address belongs to a private subnet. Under the hood, ipInSubnet checks whether a string-form IP falls within a given CIDR range. It's especially useful when you're interested in internal service communication or suspicious traffic in reserved blocks.
Filters are boolean expressions: if the result is true, the log stays. You can combine conditions with && and ||, or nest them for more precise matching.
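For example, this sketch (assuming a status_code field like the one used later in this guide) keeps only internal traffic that returned a 500:

filter ipInSubnet(ip_address, '10.8.0.0/16') && status_code == '500'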
Use block to remove unwanted noise
Where filter includes logs that match, block excludes them. This is your second pass: trim what’s common or unhelpful.
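For example (the same expression that appears in the full workflow below):

block status_code.startsWith('2')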
This removes logs with 2xx status codes (usually successful requests) so you can focus on errors or edge cases. block is great for ignoring heartbeat events, noisy health checks, or other “happy path” scenarios that dilute your dataset.
Internally, it works by evaluating a boolean expression per log and discarding any for which the expression is true.
Use choose to reduce and standardize the shape
Most logs have far more fields than you need. Use choose to extract just the fields you care about, renaming or transforming them in the process.
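As a sketch (taken from the combined query later in this guide; user_id, userId, and user_identifier stand in for whatever naming variants your sources emit):

choose firstNonNull(user_id, userId, user_identifier) as canonical_user_id, path, status_code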
Here, choose keeps only a minimal set of fields: a standardized user ID, the request path, and the status. If your logs come from multiple sources with inconsistent naming, firstNonNull helps you pick the first non-null variant and project it as a unified field.
This is especially helpful before exporting or aggregating logs: the results are smaller, faster to analyze, and easier to visualize.
Use create to add computed or contextual fields
Once your data is clean and consistent, you can add derived fields to enrich it. create lets you generate new fields based on expressions, lookups, or constants.
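For example (the same create step used in the full workflow below):

create is_internal from ipInSubnet(ip_address, '10.0.0.0/8')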
This adds a boolean field that marks whether a log came from internal IP space. You can later filter or group by this field without repeating the expression.
You can also use create to tag logs with constants or randomly generated values.
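For instance (a sketch: the randomUuid() call matches the full workflow below, while batch_label is a hypothetical constant tag):

create analysis_batch_id from randomUuid()
create batch_label from 'checkout-investigation'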
That’s useful when preparing logs for downstream export, tagging batches, or linking records during an investigation.
Putting it all together
Each of these commands does one thing well:
- filter gets you the logs you care about
- block removes the ones you don’t
- choose gives you a clean, minimal shape
- create adds context, calculations, or structure
Here’s how they come together in a full workflow:
filter ipInSubnet(ip_address, '10.8.0.0/16')
| block status_code.startsWith('2')
| choose firstNonNull(user_id, userId, user_identifier) as canonical_user_id, path, status_code
| create is_internal from ipInSubnet(ip_address, '10.0.0.0/8')
| create analysis_batch_id from randomUuid()
This query gives you just the logs you need—internal traffic with errors—reduced to essential fields, tagged with metadata to track your analysis. It’s fast, expressive, and purpose-built.
Expected output
Your logs should now look something like this:
{
  "canonical_user_id": "dave-123",
  "path": "/api/checkout",
  "status_code": "500",
  "is_internal": true,
  "analysis_batch_id": "f08a7a5e-83b7-42bd-9a1c-1098441a4c6a"
}
That’s a tight, structured document—ideal for grouping, exporting, or visualizing.