# Using DataPrime to enrich and reshape data

## Goal

By the end of this guide you should be able to enrich documents using lookup tables, extract structured values from strings, parse key-value pairs, and explode arrays into separate documents.

## Why it matters

Data in logs and traces is often messy, inconsistent, or incomplete. You may need to add metadata, normalize fields across schemas, parse semi-structured text, or reshape arrays into flat rows for easier analysis. DataPrime lets you do all of this *at query time*, without needing to preprocess or re-index.

These transformations are essential for debugging, auditing, and building clean, meaningful dashboards—even when your logs aren’t clean.

______________________________________________________________________

### Enrich documents with lookup metadata

#### Description

The [`enrich`](https://coralogix.com/docs/dataprime/language-reference/commands-reference/enrich/index.md) command decorates your logs with metadata from an external lookup table. This is useful for adding human-readable context, such as names, departments, or team ownership, keyed on fields like `userid`, `ip`, or `cluster_id`.

#### Syntax

```dataprime
enrich <lookup_value> into <target_field> using <lookup_table>
```

- `lookup_value`: The field in the document used as a lookup key.
- `target_field`: The new key where enriched data will be stored.
- `lookup_table`: The name of the custom enrichment table.

#### Example

**Sample data**

```json
{
  "userid": "111"
}
```

**Lookup table: `user_lookup_table`**

| ID  | Name  | Department |
| --- | ----- | ---------- |
| 111 | John  | Finance    |
| 222 | Emily | IT         |

**Query**

```dataprime
enrich userid into user_info using user_lookup_table
```

**Result**

```json
{
  "userid": "111",
  "user_info": {
    "ID": "111",
    "Name": "John",
    "Department": "Finance"
  }
}
```

This query appends the relevant row from the lookup table as an object under `user_info`, creating a dynamic join on read.
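To make the "join on read" idea concrete, here is a minimal Python sketch of the behavior shown above. It models the semantics only and is not how DataPrime implements `enrich`. The `enrich` function and the in-memory `USER_LOOKUP_TABLE` are hypothetical stand-ins, and leaving the document unchanged when no row matches is an assumption of this sketch.

```python
# Hypothetical in-memory stand-in for the custom enrichment table,
# keyed by the ID column.
USER_LOOKUP_TABLE = {
    "111": {"ID": "111", "Name": "John", "Department": "Finance"},
    "222": {"ID": "222", "Name": "Emily", "Department": "IT"},
}


def enrich(doc, lookup_field, target_field, table):
    """Return a copy of doc with the matching table row stored under target_field."""
    enriched = dict(doc)
    row = table.get(doc.get(lookup_field))
    # Assumption for this sketch: a document with no matching row is left as-is.
    if row is not None:
        enriched[target_field] = dict(row)
    return enriched


result = enrich({"userid": "111"}, "userid", "user_info", USER_LOOKUP_TABLE)
print(result)
```

The stored document is never modified. The lookup happens each time the query runs, so updating the table changes the result of the next query without re-indexing.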

______________________________________________________________________

### Extract structured data from strings (`extract` + `regexp`)

#### Description

The [`extract`](https://coralogix.com/docs/dataprime/language-reference/commands-reference/extract/index.md) command paired with the `regexp` extraction strategy allows you to pull structured values from text strings. It's ideal for turning loosely formatted logs into something queryable.

#### Syntax

```dataprime
extract <source_field> into <target_field> using regexp(e=/<pattern_with_named_capture_groups>/)
```

#### Example

**Sample data**

```json
{
  "message": "user Chris has logged in"
}
```

**Query**

```dataprime
extract message into parsed using regexp(e=/user (?<username>.*) has logged in/)
```

**Result**

```json
{
  "message": "user Chris has logged in",
  "parsed": {
    "username": "Chris"
  }
}
```

Now you can filter, count, or visualize by `parsed.username`, rather than relying on full-text search.
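The same extraction can be sketched in Python to show how named capture groups become keys of the target object. This illustrates the semantics and is not DataPrime code. The helper `extract_regexp` is hypothetical, and Python's `re` module writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form used in the query above.

```python
import re

# Same pattern as the DataPrime query, in Python's named-group syntax.
LOGIN_PATTERN = re.compile(r"user (?P<username>.*) has logged in")


def extract_regexp(doc, source_field, target_field, pattern):
    """Return a copy of doc with the pattern's named groups under target_field."""
    extracted = dict(doc)
    match = pattern.search(doc.get(source_field, ""))
    # A non-matching document gets None here, mirroring the null result
    # described for unmatched patterns.
    extracted[target_field] = match.groupdict() if match else None
    return extracted


result = extract_regexp(
    {"message": "user Chris has logged in"}, "message", "parsed", LOGIN_PATTERN
)
print(result)
```

Because an unmatched pattern yields no object, it is usually worth filtering out documents where the target field is empty before grouping or counting on it.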

______________________________________________________________________

### Parse key-value strings into objects (`extract` + `kv`)

#### Description

The `kv` strategy for [`extract`](https://coralogix.com/docs/dataprime/language-reference/commands-reference/extract/index.md) is ideal for parsing structured fields that follow key-value formatting (e.g., logfmt, URL query strings). It creates an object with separate keys for each parsed item.

#### Syntax

```dataprime
extract <source_field> into <target_object> using kv(pair_delimiter='&', key_delimiter='=')
```

**Note:** `kv` is only one of several extraction strategies available to `extract`. Choose the one that best fits your use case.

#### Example

**Sample data**

```json
{
  "query_string": "user=chris&env=prod"
}
```

**Query**

```dataprime
extract query_string into query_params using kv(pair_delimiter='&', key_delimiter='=')
```

**Result**

```json
{
  "query_string": "user=chris&env=prod",
  "query_params": {
    "user": "chris",
    "env": "prod"
  }
}
```

You can now access `query_params.user` and `query_params.env` directly in filters, visualizations, or enrichments.
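As a rough model of what the two delimiters do, the following Python sketch splits the string into pairs and then splits each pair into a key and a value. The function `extract_kv` is hypothetical, and silently skipping malformed pairs is an assumption of this sketch, not documented DataPrime behavior.

```python
def extract_kv(doc, source_field, target_field, pair_delimiter="&", key_delimiter="="):
    """Return a copy of doc with the parsed key-value pairs under target_field."""
    parsed = {}
    for pair in doc.get(source_field, "").split(pair_delimiter):
        key, separator, value = pair.partition(key_delimiter)
        # Assumption for this sketch: pairs with no key delimiter or an
        # empty key are skipped instead of raising an error.
        if not separator or not key:
            continue
        parsed[key] = value
    extracted = dict(doc)
    extracted[target_field] = parsed
    return extracted


result = extract_kv({"query_string": "user=chris&env=prod"}, "query_string", "query_params")
print(result)
```

Splitting on the pair delimiter first and the key delimiter second is why both need to be consistent across your data: a value that itself contains `&` would be cut into two pairs by this approach.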

______________________________________________________________________

### Explode arrays into multiple documents (`explode`)

#### Description

The [`explode`](https://coralogix.com/docs/dataprime/language-reference/commands-reference/explode/index.md) command transforms a document with an array field into multiple documents, one per array element. This makes it easier to count, filter, or group by individual values inside arrays.

#### Syntax

```dataprime
explode <array_field> into <item_field> original [preserve|discard]
```

- `original preserve`: Retains all original fields in each new document.
- `original discard`: Only includes the exploded value in each new document.

#### Example

**Sample data**

```json
{
  "userid": "1",
  "scopes": ["read", "write"]
}
```

**Query**

```dataprime
explode scopes into scope original preserve
```

**Result**

```json
{ "userid": "1", "scope": "read", "scopes": ["read", "write"] }
{ "userid": "1", "scope": "write", "scopes": ["read", "write"] }
```

Each document now contains a single `scope` value, while keeping the original context (`userid`, `scopes`).
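The difference between `original preserve` and `original discard` can be modeled with a short Python sketch. The `explode` function and its `preserve_original` flag are hypothetical names used for illustration. The sketch captures the semantics described above, not the DataPrime implementation.

```python
def explode(doc, array_field, item_field, preserve_original=True):
    """Return one new document per element of doc[array_field]."""
    results = []
    for item in doc.get(array_field, []):
        # preserve: start from a copy of every original field.
        # discard: start from an empty document.
        new_doc = dict(doc) if preserve_original else {}
        # Writing item_field last means it replaces any existing field
        # that has the same name.
        new_doc[item_field] = item
        results.append(new_doc)
    return results


doc = {"userid": "1", "scopes": ["read", "write"]}
preserved = explode(doc, "scopes", "scope", preserve_original=True)
discarded = explode(doc, "scopes", "scope", preserve_original=False)
print(preserved)
print(discarded)
```

With `preserve_original=False` each output document contains only `scope`, so `userid` is no longer available for grouping. Prefer the preserve form whenever later stages of the query need the surrounding context.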

______________________________________________________________________

## Common pitfalls or gotchas

- `enrich` only works if your lookup key is a string—cast it if needed.
- `extract` using `regexp` will return `null` if the pattern doesn't match.
- `kv` extraction assumes consistent formatting—watch for missing delimiters or malformed strings.
- `explode` overwrites destination fields if names collide—rename carefully.
