# `collect`

## Description

Returns, for each group, an array of the values produced by an expression.

- Supports optional deduplication (`distinct`), null filtering (`ignoreNulls`), and element limits (`limit`).
- Values are aggregated into an array in processing order, except where `distinct` or `limit` removes or truncates them.

## Syntax

```dataprime
collect(expression: T, distinct: bool?, limit: number?, ignoreNulls: bool?): array<T>
```

## Arguments

| Name        | Type    | Required  | Description                                                                     |
| ----------- | ------- | --------- | ------------------------------------------------------------------------------- |
| expression  | T       | **true**  | The value to collect into the array.                                            |
| distinct    | boolean | **false** | If `true`, only unique values are collected. Literal only. Defaults to `false`. |
| limit       | number  | **false** | Maximum number of values to collect. Must be a positive literal.                |
| ignoreNulls | boolean | **false** | If `true`, `null` values are skipped. Literal only. Defaults to `false`.        |
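
The optional arguments can be combined in a single call. The following is a minimal sketch, not a verified query: it assumes that the named-argument form shown for `distinct` in the example query further down also applies to `limit` and `ignoreNulls`. The field names are taken from the example data, and the limit of `10` is an arbitrary illustrative value.

```dataprime
groupby $l.applicationname aggregate collect(kubernetes.container_name, distinct = true, limit = 10, ignoreNulls = true) as containers
```

Under these assumptions, each group would yield at most 10 unique, non-null container names. All three optional values are written as literals, as the table above requires.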

## Example

**Use case: Collect distinct container names per application**

Suppose you want to gather all unique container names for each application from your Kubernetes logs.

### Example data

```json
{ "applicationname": "checkout-service", "kubernetes": { "container_name": "c1" } },
{ "applicationname": "checkout-service", "kubernetes": { "container_name": "c1" } },
{ "applicationname": "checkout-service", "kubernetes": { "container_name": "c2" } }
```

### Example query

```dataprime
groupby $l.applicationname aggregate collect(kubernetes.container_name, distinct = true) as containers
```

### Example output

| applicationname  | containers     |
| ---------------- | -------------- |
| checkout-service | [ "c1", "c2" ] |
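
For comparison, here is a sketch of the same query without the `distinct` argument. Because `distinct` defaults to `false`, the array for `checkout-service` would be expected to keep the repeated `c1` value and hold three elements instead of two.

```dataprime
groupby $l.applicationname aggregate collect(kubernetes.container_name) as containers
```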
