
Overview of Dataspaces and Datasets

Note

This feature is currently limited to early-access customers. Broader availability is planned for a future release.

Dataspaces are the top-level organizational units in Coralogix. They provide a scalable, policy-driven structure for routing, storing, securing, and querying observability data.

Rather than managing all data as a flat list of datasets, dataspaces allow you to group related data by environment, team, or workflow. This unlocks a more flexible and future-proof model for data governance, routing, and analysis.

Why this matters

As your observability footprint grows, organizing data becomes critical. Without structure, you end up with unmanageable query scopes, conflicting schemas, and inconsistent retention. Dataspaces and datasets solve this by introducing a clean separation of concerns:

  • Dataspaces define boundaries (such as environments, teams, or physical locations).
  • Datasets define content types (like logs, metrics, traces, or enrichment data).

This two-tiered model supports dynamic environments where new services, teams, or data types are constantly introduced.

[Figure: dataspaces and datasets]

How it works

Dataspaces act like databases

Each dataspace groups datasets under a single namespace and enforces shared configuration, routing logic, and access policies. This includes:

  • Routing rules that decide which data goes into which dataset.
  • Configuration templates (like base S3 paths).
  • Lifecycle policies and retention settings.
  • Access control rules for teams or roles.
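To make the routing idea concrete, here is a minimal Python sketch of a dataspace that assigns incoming events to datasets. It is an illustration only, not the Coralogix implementation or API: the `Dataspace` and `RoutingRule` classes, the `type` field on events, the `prod` dataspace name, and the first-match-wins ordering are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RoutingRule:
    """Hypothetical rule: if `matches(event)` is true, send the event to `dataset`."""
    matches: Callable[[dict], bool]
    dataset: str


@dataclass
class Dataspace:
    """Hypothetical model of a dataspace: a namespace plus an ordered rule list."""
    name: str
    rules: List[RoutingRule] = field(default_factory=list)

    def route(self, event: dict) -> Optional[str]:
        # Assumption: rules are evaluated in order and the first match wins.
        for rule in self.rules:
            if rule.matches(event):
                return f"{self.name}/{rule.dataset}"
        # No rule matched; this sketch simply reports that nothing was chosen.
        return None


# Example dataspace with two illustrative rules keyed on an assumed "type" field.
prod = Dataspace(
    name="prod",
    rules=[
        RoutingRule(lambda e: e.get("type") == "span", "spans"),
        RoutingRule(lambda e: e.get("type") == "log", "logs"),
    ],
)
```

Calling `prod.route({"type": "log"})` yields the path `prod/logs`, which has the same `<dataspace>/<dataset>` shape used when querying.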

Datasets act like tables

Datasets are the logical containers for event data inside each dataspace. They can be created automatically (based on routing patterns) or manually (for write-to workflows). You can think of them like “tables” in an SQL schema.

Datasets support:

  • Fine-grained query scopes.
  • Individual retention and access policies.
  • Modular, reusable outputs (e.g., writeto results).

Because datasets are just identifiers, they can take any name, including dot notation such as engine.queries. Dot notation does not imply a hierarchy: engine.queries and engine.schema_fields are separate, unrelated datasets.
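One way to picture flat naming is a lookup table keyed by the full dataset name. The Python sketch below is purely illustrative: the `DATASETS` registry, the `find_dataset` helper, and the description strings are hypothetical and do not correspond to any Coralogix API.

```python
from typing import Optional

# Hypothetical registry: each key is (dataspace, full dataset name).
# The dot in "engine.queries" is part of the name, not a path separator.
DATASETS = {
    ("system", "engine.queries"): "query execution records (illustrative)",
    ("system", "engine.schema_fields"): "schema field records (illustrative)",
}


def find_dataset(dataspace: str, dataset: str) -> Optional[str]:
    """Look up a dataset by its exact name; no prefix or parent matching."""
    return DATASETS.get((dataspace, dataset))


# Both dotted names resolve independently of each other.
queries = find_dataset("system", "engine.queries")
schema_fields = find_dataset("system", "engine.schema_fields")

# There is no implicit parent called "engine" that contains them.
parent = find_dataset("system", "engine")
```

In this model `parent` is `None`, because only exact names exist. The shared `engine.` prefix is a naming convention and nothing more.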

Querying across dataspaces and datasets

You can query any dataset with DataPrime using the format:

source <dataspace>/<dataset>

Examples:

source default/logs
source system/engine.queries

If no dataspace is provided, the default dataspace is assumed:

source logs  // equivalent to source default/logs

This means that if you're only using the default dataspace, your existing queries will continue to work.
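The resolution rule described above can be sketched in a few lines of Python. This is a hedged illustration of the `<dataspace>/<dataset>` convention, not DataPrime's actual parser: `SourceRef`, `resolve_source`, and the error handling for malformed input are assumptions introduced for the example.

```python
from typing import NamedTuple

# Per the text, the default dataspace is assumed when none is provided.
DEFAULT_DATASPACE = "default"


class SourceRef(NamedTuple):
    dataspace: str
    dataset: str


def resolve_source(ref: str) -> SourceRef:
    """Split '<dataspace>/<dataset>' into its parts, defaulting the dataspace."""
    ref = ref.strip()
    if not ref:
        raise ValueError("empty source reference")
    # Split on the first slash only; dots in the dataset name are left untouched.
    dataspace, slash, dataset = ref.partition("/")
    if not slash:
        # No slash: the whole string is a dataset in the default dataspace.
        return SourceRef(DEFAULT_DATASPACE, ref)
    if not dataspace or not dataset:
        # Assumption: a missing side of the slash is treated as an error here.
        raise ValueError(f"malformed source reference: {ref!r}")
    return SourceRef(dataspace, dataset)
```

Under this sketch, `resolve_source("logs")` and `resolve_source("default/logs")` return the same value, which mirrors why existing queries keep working, while `resolve_source("system/engine.queries")` keeps the dotted name intact as a single dataset identifier.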

Default and system dataspaces

  • Default dataspace: Every account has one. It includes all the standard sources like logs, spans, and enrichments.
  • System dataspace: This is reserved for Coralogix-generated data such as alerts history, audit logs, and notification deliveries.