# `join`

## Description

The `join` command merges the current (left) query with the results of a second (right) query. Conceptually, it forms a Cartesian product of left and right rows, then applies join logic based on a condition (`on`) or matching keypaths (`using`) and writes the matching right-side row into a destination keypath (`into`).

**Join types**

- **left** (default): Keep all left rows; attach matching right rows or `null` when no match.
- **inner**: Keep only rows that match on both sides.
- **full**: Keep all rows from both sides; non-matching fields are `null`.
- **cross**: Return the full Cartesian product; no `on`/`using` supported.

**Addressing fields**

- Use `left=>` and `right=>` prefixes inside `on` to disambiguate fields with the same name; omit when a keypath exists only on one side.

**Behavior & caveats**

- Conditions support **equality (`==`)** on keypaths; chain multiple with `&&`.
- One side must be relatively small (< 200MB); reduce size with `filter` or `remove`.
- In left joins, nulls in the join keys prevent matches; use `join full` to include unmatched keys from either side.
- Joins can duplicate rows if multiple matches exist; consider preprocessing (e.g., `distinct`) to avoid unintended multiplicity.

## Syntax

```dataprime
<left_side_query> | join [left/inner/full] (<right_side_query>) on <condition> into <right_side_target>

<left_side_query> | join [left/inner/full] (<right_side_query>) using <join_keypath_1> [, <join_keypath_2>, ...] into <right_side_target>

<left_side_query> | join cross (<right_side_query>) into <right_side_target>
```

## Example

**Use case: Attach login counts to user records (left join by default)**

We have a `users` enrichment table and a set of login events. We want each user row to include the number of logins associated with that user. If a user has no logins, the joined field should be `null` (left-join semantics).

### Example data

```json
{
  "users": [
    { "id": "111", "name": "John" },
    { "id": "222", "name": "Emily" },
    { "id": "333", "name": "Alice" }
  ],
  "logs": [
    { "userid": "111", "timestamp": "2022-01-01T12:00:00Z" },
    { "userid": "111", "timestamp": "2022-01-01T12:30:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" }
  ]
}
```

### Example query

```dataprime
source users | join (source logs | countby userid) on id == userid into logins
```

### Example output

```json
{ "id": "111", "name": "John",  "logins": { "userid": "111", "_count": 2 } },
{ "id": "222", "name": "Emily", "logins": { "userid": "222", "_count": 3 } },
{ "id": "333", "name": "Alice", "logins": null }
```

The `join` attaches a single aggregated row per user into `logins`. Users without matching events (e.g., `"333"`) retain their left-side fields with the joined target set to `null`.
