# `stitch`

## Description

The `stitch` command performs a **horizontal union** of two datasets, combining them side-by-side. It aligns rows from one dataset with rows from another and merges their columns into a single unified dataset.

This is particularly useful for joining datasets that share a logical order but lack a join key.

**Key behaviors:**

- Rows are combined in order (row 1 with row 1, row 2 with row 2, etc.).
- If one dataset contains more rows than the other, unmatched rows include `null` values for missing columns.
- The resulting dataset contains all columns from both sides.

**Difference from `union`:**

- `stitch` merges datasets horizontally (adding columns).
- `union` merges datasets vertically (adding rows).

## Syntax

```dataprime
stitch (<subquery>) into <target-keypath>
```

## Example 1

**Use case: Merge sales and revenue data side-by-side**

Suppose you have two datasets—`sales` and `revenue`—that contain related information about products. You can use `stitch` to combine them horizontally into a single document per product, provided both datasets are ordered consistently.

### Example data

```json
\\ sales
{ "product": "Widget", "sales": 100 },
{ "product": "Gadget", "sales": 200 },
{ "product": "Dashboard", "sales": 150 }

\\ revenue
{ "product": "Widget", "revenue": 5000 },
{ "product": "Gadget", "revenue": 8000 },
{ "product": "Dashboard", "revenue": 6000 }
```

### Example query

```dataprime
source sales
| orderby product
| stitch (source revenue | orderby product) into combined_data
```

### Example output

```json
{ "product": "Widget", "sales": 100, "combined_data": { "product": "Widget", "revenue": 5000 } },
{ "product": "Gadget", "sales": 200, "combined_data": { "product": "Gadget", "revenue": 8000 } },
{ "product": "Dashboard", "sales": 150, "combined_data": { "product": "Dashboard", "revenue": 6000 } }
```

The `stitch` command aligns rows based on their order after sorting and attaches the right dataset (`revenue`) into the target keypath (`combined_data`) within the left dataset (`sales`).

Note

If the datasets have differing row counts, missing values from the shorter dataset are replaced with `null`:

## Example 2

If the datasets have unequal rows, the stitch command fills missing values with null.

Consider the following datasets:

sales dataset: (3 rows)

```json
{ "product": "Widget", "sales": 100 }
{ "product": "Gadget", "sales": 200 }
{ "product": "Dashboard", "sales": 150 }
```

revenue dataset: (2 rows)

```json
{ "product": "Widget", "revenue": 5000 }
{ "product": "Gadget", "revenue": 8000 }
```

### Example query

```dataprime
source sales | orderby product
| stitch (source revenue | orderby product) into combined_data
```

### Example output

```json
{ "product": "Widget", "sales": 100, "combined_data": { "product": "Widget", "revenue": 5000 } }
{ "product": "Gadget", "sales": 200, "combined_data": { "product": "Gadget", "revenue": 8000 } }
{ "product": "Dashboard", "sales": 150, "combined_data": { "product": "Dashboard", "revenue": null } }
```

Note

Rows must represent the same entities and appear in the same order for meaningful stitching. If row alignment differs, results may not match as expected.
