## Goal

By the end of this guide, you should be able to:

- Recognize the core DataPrime commands and when to use them.
- Combine multiple commands to perform real-time filtering, transformations, and aggregations.
- Understand the structural difference between commands and functions in a query pipeline.

## Why it matters

Commands are the engine behind most DataPrime queries. Raw logs are rarely in the shape you need. Whether you're triaging an incident, generating a report, or building a dashboard, you’ll need to transform your data quickly and safely. Commands in DataPrime provide the building blocks to:

- filter out noise
- extract structure
- enrich and join values
- compute statistics
- reshape your dataset in real time

Mastering these tools is what turns a basic query into a flexible investigation or automation pipeline.

## Commands vs. functions

In DataPrime, **commands** are top-level operations that act on rows, fields, and entire document sets. **Functions**, by contrast, transform individual values and appear inside the expressions that commands evaluate.
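
To make the distinction concrete, here is a minimal sketch in which a function (`length`) computes a value per row while commands (`create`, `filter`) reshape the dataset. The `message` field, the `message_length` name, and the threshold of 200 are illustrative assumptions, not part of any particular schema.

```dataprime
source logs
| create message_length from length(message)
| filter message_length > 200
```

Here `length(message)` never stands alone on a line. It only appears as part of the expression passed to `create`.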

## Common patterns and syntax

Most commands appear at the **start of a line** and accept one or more **arguments or expressions**.

They're usually chained with the pipe (`|`) operator, as `filter`, `groupby`, and `top` are in the following example:

```dataprime
source logs
| filter status_code >= 500
| groupby path aggregate count() as error_count
| top 5 path by error_count
```

Data flows from left to right and top to bottom: each command receives the output of the previous one and transforms the dataset further.

______________________________________________________________________

## Core command categories and examples

### Source, limiting & ordering

- [**`source`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/sources/source/index.md) – Explicitly define your data source (`logs`, `spans`, `metrics`, or enrichment tables).

  ```dataprime
  source logs
  ```

- [**`limit`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/limit/index.md) – Limit the number of rows returned.

  ```dataprime
  limit 100
  ```

- [**`orderby` / `sortby`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/orderby_sortby/index.md) – Sort by an expression.

  ```dataprime
  sortby duration desc
  ```
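
As a rough sketch of how these three commands combine, the following query returns the 100 slowest log entries. It assumes your logs carry a numeric `duration` field, which may not match your own schema.

```dataprime
source logs
| orderby duration desc
| limit 100
```

Sorting before limiting matters here: reversing the two commands would sort an arbitrary 100 rows rather than return the slowest ones.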

______________________________________________________________________

### Selection & filtering

These commands reduce the dataset by applying filters or keeping only relevant fields.

- [**`filter`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/filter/index.md) – Keep rows where a condition is true.

  ```dataprime
  filter status_code >= 500
  ```

- [**`block`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/block/index.md) – The opposite of `filter`. Remove rows that match a condition.

  ```dataprime
  block method == 'OPTIONS'
  ```

- [**`choose` / `select`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/choose/index.md) – Keep only the specified fields.

  ```dataprime
  choose path, status_code
  ```

- **`distinct`** – Return one row per unique value.

  ```dataprime
  distinct user_id
  ```

- [**`find` / `text`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/find_text/index.md) – Free-text search within a field or across all data.

  ```dataprime
  find 'timeout' in message
  text '503'
  ```
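
The selection commands above are typically combined. The sketch below keeps server errors, drops `OPTIONS` requests, and trims each row to three fields. The field names (`status_code`, `method`, `path`) are taken from the examples in this section and are assumed to exist in your data.

```dataprime
source logs
| filter status_code >= 500
| block method == 'OPTIONS'
| choose path, status_code, method
```

Placing `choose` last keeps `method` available for the `block` step before the row is trimmed.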

______________________________________________________________________

### Data creation & mutation

Commands that let you generate new fields or modify existing ones.

- [**`create` / `add` / `c`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/create/index.md) – Define a new field based on an expression. This acts similarly to a variable.

  ```dataprime
  create is_error from status_code >= 500
  ```

- [**`replace`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/replace/index.md) – Overwrite a field with a new value.

  ```dataprime
  replace duration_ms with duration / 1_000_000
  ```

- [**`remove`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/remove/index.md) – Remove fields from the document.

  ```dataprime
  remove user_agent
  ```

- [**`convert`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/convert/index.md) – Convert fields to a specified data type, which keeps type conversions explicit and readable.

  ```dataprime
  convert datatypes status_code:number
  ```

- [**`redact`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/redact/index.md) – Mask sensitive data using regex.

  ```dataprime
  redact message matching /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/ to '[EMAIL]'
  ```
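
A short sketch of several mutation commands working together, reusing the hypothetical fields from the examples above (`status_code`, `message`, `user_agent`): it derives a flag, masks email addresses, and drops a field that is no longer needed.

```dataprime
source logs
| create is_error from status_code >= 500
| redact message matching /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/ to '[EMAIL]'
| remove user_agent
```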

______________________________________________________________________

### Aggregation & grouping

These commands reduce many rows into a summary using groupings or statistics.

- [**`aggregate`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/aggregate/index.md) – Run one or more aggregation functions on the entire dataset.

  ```dataprime
  aggregate count() as total_logs
  ```

- [**`groupby`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/groupby/index.md) – Group by a field or expression, and aggregate within those groups.

  ```dataprime
  groupby path aggregate avg(duration) as avg_duration
  ```

- [**`countby`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/countby/index.md) – Shorthand to group and count.

  ```dataprime
  countby status_code into error_counts
  ```

- [**`top` / `bottom`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/top/index.md) – Get top/bottom N records by a sort metric.

  ```dataprime
  top 5 path by count()
  ```

- [**`multigroupby`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/multigroupby/index.md) – Nested groupings (use sparingly for performance).
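
As an illustration of aggregation in a full pipeline, the sketch below computes two statistics per `path` for failing requests and sorts the result. It assumes `status_code` and `duration` are numeric fields. The aliases `error_count` and `avg_duration` are arbitrary names chosen for this example.

```dataprime
source logs
| filter status_code >= 500
| groupby path aggregate count() as error_count, avg(duration) as avg_duration
| orderby error_count desc
```

After `groupby`, only the grouping key and the aggregated fields remain, so later commands must refer to `path`, `error_count`, or `avg_duration`.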

______________________________________________________________________

### Parsing & extraction

Commands that turn unstructured data into usable fields.

- [**`extract`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/extract/index.md) – Use regex or key-value logic to pull fields out of strings.

  ```dataprime
  extract message into fields using regexp(e=/(?<user>\w+) did (?<action>\w+)/)
  ```

- [**`explode`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/explode/index.md) – Split an array into multiple rows.

  ```dataprime
  explode scopes into scope original preserve
  ```
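
Extracted fields become available to every command that follows. As a sketch, assuming `message` contains text such as `alice did login`, the query below extracts the named groups and then counts rows per action:

```dataprime
source logs
| extract message into fields using regexp(e=/(?<user>\w+) did (?<action>\w+)/)
| countby fields.action
```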

______________________________________________________________________

### Joins & enrichment

Commands that combine data from other sources or augment with external context.

- [**`enrich`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/enrich/index.md) – Join with a lookup table (e.g., employee info).

  ```dataprime
  enrich user_id into user_info using employees
  ```

- [**`join`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/join/index.md) – Combine two queries based on a condition.

  ```dataprime
  source users | join (source logs | countby userid) on id == userid into logins
  ```
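
Once a row is enriched, the new key behaves like any other field. A minimal sketch, assuming a lookup table named `employees` keyed by `user_id` (as in the `enrich` example above) and a `path` field on each log:

```dataprime
source logs
| enrich user_id into user_info using employees
| choose user_id, user_info, path
```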

______________________________________________________________________

### Deduplication

Commands that reduce redundancy or volume.

- [**`dedupeby`**](https://coralogix.com/docs/dataprime/language-reference/commands-reference/dedupeby/index.md) – Keep up to N rows for each unique combination of the given expression(s).

  ```dataprime
  dedupeby operationName keep 5
  ```

______________________________________________________________________

## When to use a command vs a function

- **Use a function** when working with individual field values (`ipInSubnet`, `length`, `urlDecode`, etc.).
- **Use a command** when transforming the shape, size, or structure of your dataset.

______________________________________________________________________

## Gotchas

- **Commands must be in the correct order.** For example, run `create` before `filter` if you're filtering on a derived field.
- **Type mismatches can break filters or aggregations.** Use `convert` or casts if necessary.
- **Chaining too many heavy operations on large datasets may exceed limits.** Break into smaller queries if needed.
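
The first two gotchas can be sketched in one query. Assuming `status_code` may arrive as a string, the conversion comes first, the derived field second, and the filter that depends on it last:

```dataprime
source logs
| convert datatypes status_code:number
| create is_error from status_code >= 500
| filter is_error == true
```

Moving the `filter` line above `create` would reference `is_error` before it exists.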
