Overview of Dataspaces and Datasets
Note
This feature is currently limited to early-access customers. Broader availability is planned for a future release.
Dataspaces are the top-level organizational units in Coralogix. They provide a scalable, policy-driven structure for routing, storing, securing, and querying observability data.
Rather than managing all data as a flat list of datasets, dataspaces allow you to group related data by environment, team, or workflow. This unlocks a more flexible and future-proof model for data governance, routing, and analysis.
Why this matters
As your observability footprint grows, organizing data becomes critical. Without structure, you'll end up with unmanageable query scopes, conflicting schemas, and inconsistent retention. Dataspaces and datasets solve this by introducing clean separation of concerns:
- Dataspaces define boundaries (such as environments, teams, or physical locations).
- Datasets define content types (like logs, metrics, traces, or enrichment data).
This two-tiered model supports dynamic environments where new services, teams, or data types are constantly introduced.
How it works
Dataspaces act like databases
Each dataspace groups datasets under a single namespace and enforces shared configuration, routing logic, and access policies. This includes:
- Routing rules that decide which data goes into which dataset.
- Configuration templates (like base S3 paths).
- Lifecycle policies and retention settings.
- Access control rules for teams or roles.
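As a rough mental model only, the relationship between a dataspace, its routing rules, and its datasets can be sketched in Python. This is not Coralogix code or configuration syntax. The class names, the glob-style matching, the "first matching rule wins" behavior, and the dataspace and dataset names are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class RoutingRule:
    # Hypothetical rule: event sources matching the glob pattern
    # are sent to the named dataset.
    pattern: str
    dataset: str


@dataclass
class Dataspace:
    # Hypothetical model of a dataspace: one namespace with shared settings.
    name: str
    retention_days: int  # illustrative shared lifecycle setting
    rules: list[RoutingRule] = field(default_factory=list)
    datasets: set[str] = field(default_factory=set)

    def route(self, source: str) -> str | None:
        """Return the dataset for an event source (assumed: first match wins)."""
        for rule in self.rules:
            if fnmatch(source, rule.pattern):
                # Record the dataset the first time a rule routes data to it.
                self.datasets.add(rule.dataset)
                return rule.dataset
        return None


prod = Dataspace(
    name="prod",
    retention_days=30,
    rules=[
        RoutingRule(pattern="nginx-*", dataset="web_logs"),
        RoutingRule(pattern="*", dataset="logs"),
    ],
)

routed_nginx = prod.route("nginx-edge")    # matches the first rule
routed_other = prod.route("billing-api")   # falls through to the catch-all rule
```

In this sketch the dataspace owns both the rules and the resulting datasets, which mirrors the idea that routing and policy are defined once per dataspace rather than once per dataset.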
Datasets act like tables
Datasets are the logical containers for event data inside each dataspace. They can be created automatically (based on routing patterns) or manually (for write-to workflows). You can think of them like “tables” in an SQL schema.
Datasets support:
- Fine-grained query scopes.
- Individual retention and access policies.
- Modular, reusable outputs (e.g., writeto results).
Because datasets are just identifiers, they can take any name, including dot notation like engine.queries. This does not imply a hierarchy: engine.queries and engine.schema_fields are separate, unrelated datasets.
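The "names are flat identifiers" point can be illustrated with a plain dictionary lookup. This is a hypothetical in-memory catalogue, not a Coralogix API, and the retention values are made up for the example.

```python
# Hypothetical catalogue: dataset names are opaque keys, dots included.
datasets = {
    "engine.queries": {"retention_days": 30},
    "engine.schema_fields": {"retention_days": 7},
}


def lookup(name: str) -> dict:
    """Exact-match lookup; a dot in the name is never treated as a path separator."""
    return datasets[name]


# Two names sharing the "engine." prefix does not create a parent called "engine".
has_parent = "engine" in datasets
queries_settings = lookup("engine.queries")
```

Each name resolves on its own, so settings on engine.queries say nothing about engine.schema_fields.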
Querying across dataspaces and datasets
You can query any dataset with DataPrime using the format:
Examples:
If no dataspace is provided, the default dataspace is assumed:
This means that if you're only using the default dataspace, your existing queries will continue to work.
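The fallback rule can be expressed as a small resolver. The sketch below deliberately does not reproduce DataPrime syntax. It only models the rule "no dataspace given means the default dataspace". The names DatasetRef and resolve, and the "prod" dataspace, are hypothetical.

```python
from typing import NamedTuple, Optional

DEFAULT_DATASPACE = "default"


class DatasetRef(NamedTuple):
    # Fully qualified reference: which dataspace, which dataset.
    dataspace: str
    dataset: str


def resolve(dataset: str, dataspace: Optional[str] = None) -> DatasetRef:
    """Fill in the default dataspace when the caller does not name one."""
    return DatasetRef(dataspace or DEFAULT_DATASPACE, dataset)


implicit = resolve("logs")                    # no dataspace given
explicit = resolve("logs", dataspace="prod")  # hypothetical "prod" dataspace
```

Under this model, a query that names only a dataset keeps resolving to the same place it did before dataspaces existed.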
Default and system dataspaces
- Default dataspace: Every account has one. It includes all the standard sources like logs, spans, and enrichments.
- System dataspace: This is reserved for Coralogix-generated data such as alerts history, audit logs, and notification deliveries.