# Create user-defined datasets in the default dataspace

Note

This feature is available for early-access customers. To request access and confirm your organization meets the feature criteria, contact your account representative or Support.

Create custom datasets under the `default/` dataspace to isolate log streams, prevent schema collisions, and apply dataset-level access control. After you create a dataset, route logs to it using [TCO Optimizer](https://coralogix.com/docs/user-guides/account-management/tco-optimizer/index.md) policies, and query it directly from DataPrime or Explore.

## How it works

User-defined datasets live under the `default/` dataspace alongside the predefined `logs` and `spans` datasets. Data routed to a user-defined dataset:

- Is stored in your configured archive, inherited from Setup Archive.
- Is not indexed in OpenSearch, so **High** and **Block** TCO priorities are unavailable.
- Counts toward your [daily unit quota](https://coralogix.com/docs/user-guides/account-management/payment-and-billing/data-usage/index.md) at the priority of the routing policy.
- Has its usage tracked per dataset and per dataspace in **Data Usage**.
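
The priority restriction in the list above can be sketched as a small check. This Python example is illustrative only: the function name `allowed_priorities`, the lowercase priority labels, and the example dataset `default/payments` are assumptions made for this sketch, not part of any Coralogix API. It also assumes the full set of TCO priorities is High, Medium, Low, and Block.

```python
# Illustrative sketch only; names and labels are assumptions, not a Coralogix API.
ALL_PRIORITIES = {"high", "medium", "low", "block"}

# User-defined datasets are not indexed in OpenSearch, so these two are unavailable.
UNAVAILABLE_FOR_USER_DEFINED = {"high", "block"}


def allowed_priorities(dataset_path: str) -> set[str]:
    """Return the TCO priorities a policy may use when targeting dataset_path."""
    if dataset_path == "default/logs":
        return set(ALL_PRIORITIES)
    return ALL_PRIORITIES - UNAVAILABLE_FOR_USER_DEFINED


print(sorted(allowed_priorities("default/logs")))
print(sorted(allowed_priorities("default/payments")))  # hypothetical dataset name
```

Under these assumptions, a policy targeting a user-defined dataset is left with the Medium and Low priorities.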

## Streaming vs summary datasets

A user-defined dataset is created the same way regardless of its purpose. How it is **populated** determines its type:

- **Streaming dataset** — populated by a [TCO Optimizer](https://coralogix.com/docs/user-guides/account-management/tco-optimizer/index.md) policy that routes ingested logs into the dataset in real time. Use this when you want a live, dataset-scoped feed of incoming data.
- **Summary dataset** — populated by [Background queries v2](https://coralogix.com/docs/user-guides/dataengine/background_queries_v2/index.md) writing query results into the dataset. Use this for pre-aggregated or pre-filtered slices that downstream queries and dashboards can read quickly without rerunning heavy scans.

The same dataset can serve as both, but a single ingestion source is recommended per dataset to keep schema evolution predictable. Entity type and partitioning are determined by the first write, regardless of source — see the [Background queries v2 entity-type and partitioning notes](https://coralogix.com/docs/user-guides/dataengine/background_queries_v2/#entity-type) for details.

## Create a dataset

1. Navigate to **Data Flow**, then **Dataset Management**.
1. Select the **Default** tab.
1. Select **Create Dataset**.
1. In **Dataset name**, enter a unique name. Names must be unique within the `default/` dataspace.
1. In **Description**, enter an optional description.
1. In **Bucket name**, keep the storage bucket inherited from your archive configuration or, optionally, enter a different bucket name.
1. Under **Access Policy**, optionally select **Enabled Access Policy** to [restrict access](https://coralogix.com/docs/user-guides/data-layer/dataset-management/access-control/index.md).
1. Select **Create**.

The dataset appears in the **Default** tab. To query it immediately, hover over the dataset row and select the option to open it in Explore.

## Route data to a dataset

Route logs to a user-defined dataset by setting it as the target in a [TCO Optimizer](https://coralogix.com/docs/user-guides/account-management/tco-optimizer/index.md) policy. When creating or editing a policy, select your dataset from the **Target Dataset** dropdown.

TCO policies are evaluated in order. The first matching policy determines the routing target. If no policy matches, logs route to `default/logs`. To revert a policy to the standard logs destination, select **Reset to default** in the **Target Dataset** section.
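
The first-match rule described above can be sketched in a few lines of Python. This is a simplified model for illustration, not the TCO Optimizer implementation: the `Policy` class, the `route` function, the log field names `application` and `severity`, and the dataset names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

DEFAULT_TARGET = "default/logs"  # destination when no policy matches


@dataclass
class Policy:
    name: str
    matches: Callable[[dict], bool]       # hypothetical predicate over a log record
    target_dataset: Optional[str] = None  # None models "Reset to default"


def route(log: dict, policies: list[Policy]) -> str:
    """Return the dataset path chosen by the first matching policy."""
    for policy in policies:  # evaluated in list order
        if policy.matches(log):
            return policy.target_dataset or DEFAULT_TARGET
    return DEFAULT_TARGET


policies = [
    Policy("payments", lambda log: log.get("application") == "payments", "default/payments"),
    Policy("debug", lambda log: log.get("severity") == "debug", "default/debug-logs"),
]

# Both policies match this record; the earlier one wins.
print(route({"application": "payments", "severity": "debug"}, policies))
# No policy matches, so the record falls through to the default destination.
print(route({"application": "checkout", "severity": "info"}, policies))
```

In this model, reordering the list changes where the first record lands, which is why policy order matters when two policies can match the same log.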

## Query a dataset

Query a user-defined dataset in DataPrime using its full dataspace path:

```dataprime
source default/<dataset-name>
```
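
If you generate queries from scripts, the path format above can be assembled with a small helper. The following Python sketch is an assumption-laden illustration: `dataset_source` is a hypothetical helper, `payments` is a placeholder dataset name, and the only validation it performs is rejecting an empty name.

```python
def dataset_source(dataset_name: str, dataspace: str = "default") -> str:
    """Build the DataPrime `source` clause for a dataset's full dataspace path."""
    name = dataset_name.strip()
    if not name:
        raise ValueError("dataset name must not be empty")
    return f"source {dataspace}/{name}"


# "payments" is a placeholder; substitute the name of your own dataset.
print(dataset_source("payments"))
```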

From **Explore**, select your dataset from the data source selector to search and visualize it interactively.

## Limitations

- Physical location is inherited from your Setup Archive and cannot be customized per dataset in this release.
- Only data of the `logs` entity type can be **routed via TCO** (streaming datasets). Spans cannot be streamed into a user-defined dataset. Summary datasets written by [Background queries v2](https://coralogix.com/docs/user-guides/dataengine/background_queries_v2/index.md) support `logs`, `spans`, `jsonData`, and other entity types.
- **High** and **Block** TCO priorities are available for `default/logs` only.
- Dataset names must be unique within `default/` and cannot be renamed after creation.
- Deleting a dataset removes its configuration and stops routing. Data already written to storage is not deleted — retention is controlled by your S3 lifecycle policies.

## Permissions

All permissions are in the `dataengine` permission group, under the `team-datasets` resource.

| Action               | Description                      |
| -------------------- | -------------------------------- |
| Read                 | Query data from the dataset      |
| Append Data          | Write new records to the dataset |
| Overwrite Data       | Replace data in the dataset      |
| Read Access Policy   | View the dataset access policy   |
| Manage Access Policy | Edit the dataset access policy   |
