Skip to content

join

Description

The join command merges the current (left) query with the results of a second (right) query. Conceptually, it forms a Cartesian product of left and right rows, then applies join logic based on a condition (on) or matching keypaths (using) and writes the matching right-side row into a destination keypath (into).

Join types

  • left (default): Keep all left rows; attach matching right rows or null when no match.
  • inner: Keep only rows that match on both sides.
  • full: Keep all rows from both sides; non-matching fields are null.
  • cross: Return the full Cartesian product; no on/using supported.

Addressing fields

  • Use left=> and right=> prefixes inside on to disambiguate fields with the same name; omit when a keypath exists only on one side.

Behavior & caveats

  • Conditions support equality (==) on keypaths; chain multiple with &&.
  • One side must be relatively small (< 200MB); reduce size with filter or remove.
  • In left joins, nulls in the join keys prevent matches; use join full to include unmatched keys from either side.
  • Joins can duplicate rows if multiple matches exist; consider preprocessing (e.g., distinct) to avoid unintended multiplicity.

Syntax

<left_side_query> | join [left/inner/full] (<right_side_query>) on <condition> into <right_side_target>

<left_side_query> | join [left/inner/full] (<right_side_query>) using <join_keypath_1> [, <join_keypath_2>, ...] into <right_side_target>

<left_side_query> | join cross (<right_side_query>) into <right_side_target>

Example

Use case: Attach login counts to user records (left join by default)

We have a users enrichment table and a set of login events. We want each user row to include the number of logins associated with that user. If a user has no logins, the joined field should be null (left-join semantics).

Example data

{
  "users": [
    { "id": "111", "name": "John" },
    { "id": "222", "name": "Emily" },
    { "id": "333", "name": "Alice" }
  ],
  "logs": [
    { "userid": "111", "timestamp": "2022-01-01T12:00:00Z" },
    { "userid": "111", "timestamp": "2022-01-01T12:30:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" },
    { "userid": "222", "timestamp": "2022-01-01T13:00:00Z" }
  ]
}

Example query

source users | join (source logs | countby userid) on id == userid into logins

Example output

{ "id": "111", "name": "John",  "logins": { "userid": "111", "_count": 2 } },
{ "id": "222", "name": "Emily", "logins": { "userid": "222", "_count": 3 } },
{ "id": "333", "name": "Alice", "logins": null }

The join attaches a single aggregated row per user into logins. Users without matching events (e.g., "333") retain their left-side fields with the joined target set to null.