Dataspaces and datasets
Dataspaces and datasets provide a two-tiered model for organizing, routing, and securing observability data in Coralogix.
- Dataspaces define organizational boundaries — such as environments, business units, teams, or regions.
- Datasets define logical groupings of content — such as logs, traces, metrics, or enrichment data.
Ready to get started?
Create a user-defined dataset, view and manage system datasets, or query existing datasets in Explore.
Dataspaces
A dataspace is a logical container for one or more datasets. It acts as a control layer for:
- Routing logic
- Storage structure
- Retention policies
- Access control
- Schema enforcement
Think of dataspaces like databases. Each dataspace groups datasets under a single namespace and enforces shared configuration. For example, your organization might define separate dataspaces for frontend, backend, and security.
Configuration inheritance
When a dataset is created inside a dataspace, it automatically inherits the dataspace's configuration:
- S3 storage paths
- Retention rules
- Access policies
- Metadata enrichment
For example, if a dataspace defines the S3 path s3://my-bucket/my_prefix, new datasets inside that dataspace automatically write under that path.
This inheritance is dynamic: no manual setup is needed when new datasets appear.
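The inheritance behavior described above can be pictured with a small Python model. This is only an illustrative sketch, not how Coralogix implements it: the `Dataspace` and `Dataset` classes, their field names, and `effective_config` are all hypothetical. The point it shows is that a dataset holds no settings of its own and resolves them from its parent at read time.

```python
from dataclasses import dataclass


@dataclass
class Dataspace:
    """Hypothetical model of a dataspace and its shared configuration."""
    name: str
    s3_path: str
    retention_days: int
    readers: frozenset = frozenset()


@dataclass
class Dataset:
    """Hypothetical dataset that stores no configuration of its own."""
    name: str
    parent: Dataspace

    def effective_config(self) -> dict:
        # Every setting is looked up on the parent when requested, so a
        # change to the dataspace is visible to all of its datasets.
        return {
            "s3_path": self.parent.s3_path,
            "retention_days": self.parent.retention_days,
            "readers": self.parent.readers,
        }


# Illustrative values; only the S3 path comes from the example above.
backend = Dataspace("backend", "s3://my-bucket/my_prefix", 30, frozenset({"sre"}))
logs = Dataset("logs", backend)
config = logs.effective_config()
```

Because the lookup happens on every call, changing `backend.retention_days` afterwards is reflected the next time `logs.effective_config()` runs, which mirrors the "no manual setup" property.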
Types of dataspaces
| Type | Description |
|---|---|
| default | The main user-facing dataspace. Contains datasets like logs, spans, and any user-defined datasets. |
| system | A Coralogix-managed dataspace for internal datasets such as alert history, audit events, and schema metadata. |
| user-defined | Custom dataspaces created by users to segment data by team, region, environment, or use case. Coming soon. |
Datasets
A dataset is a scoped collection of related data within a dataspace. Think of datasets like tables in a database. Each dataset contains a specific stream of observability data (e.g., logs, traces, alerts) that inherits configuration from its parent dataspace.
Datasets are created:
- Automatically: via routing logic
- Manually: through the UI or writeTo queries
- Dynamically: based on values like $d.region, $l.applicationname, etc.
Note
Datasets currently work only with archived data.
Because datasets are just identifiers, they can take any name, including dot notation like engine.queries. This does not imply a hierarchy — engine.queries and engine.schema_fields are separate, unrelated datasets.
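To make the "flat identifier" point concrete, here is a rough Python sketch of dynamic creation keyed on a label value. The `route` function and the record shape are assumptions made for illustration; they are not part of any Coralogix API. Dataset names are plain dictionary keys, so a dot in a name creates no parent or child relationship.

```python
from collections import defaultdict


def route(records, label):
    """Group records into datasets named after the value of one label."""
    datasets = defaultdict(list)
    for record in records:
        # A dataset comes into existence the first time its name is seen.
        datasets[record[label]].append(record)
    return dict(datasets)


# Hypothetical records; "applicationname" stands in for $l.applicationname.
records = [
    {"applicationname": "engine.queries", "msg": "q1"},
    {"applicationname": "engine.schema_fields", "msg": "f1"},
    {"applicationname": "engine.queries", "msg": "q2"},
]
datasets = route(records, "applicationname")
```

In this sketch `datasets` has exactly two keys, `engine.queries` and `engine.schema_fields`, and no `engine` entry exists to group them.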
Key capabilities
| Capability | Description |
|---|---|
| Dynamic creation | Datasets are created on-the-fly based on routing rules or labels like $l.applicationname. No manual setup required. |
| Scoped performance | Segmented datasets reduce schema collisions and improve query speed by narrowing the search space. |
| Granular control | Apply retention, access, routing, and enrichment policies at the dataset level. |
| Reusability | Write query results into datasets and retrieve them later for dashboards, joins, or long-term analytics. |
Writing to and reading from a dataset
// Write query results to a dataset
source logs
| filter status_code >= 500
| writeTo default/high_errors
Note
Duplicated data created by queries will count towards your quota.
Dataset schemas
Each dataset has an associated schema, influenced by its pillar (logs, spans, etc.) and entity type (e.g., alerts, browserLogs, cpuProfiles).
| Pillar | Entity type | Example schema |
|---|---|---|
| logs | alerts | { alert_name, severity, status, triggered_at } |
| logs | browserLogs | { user_agent, page_url, timestamp } |
| logs | text | { text: "..." } |
| spans | spans | OpenTelemetry-formatted span objects |
| metrics | metrics | { __name__, value, labels... } |
| binary | sessionRecordings | Metadata + link to binary |
| binary | files | File metadata (e.g., name, size, uploaded_by) |
Enabling and disabling datasets
Datasets, especially system datasets, must be manually enabled. Once enabled:
- All users can query them
- They count toward your daily quota
- Previously generated data remains accessible, even if later disabled
Disabling a dataset stops its ingestion — not its storage.
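The distinction between ingestion and storage can be sketched with a toy Python class. `ToyDataset` and its methods are hypothetical and exist only to illustrate the rule above: a disabled dataset rejects new records, while records stored earlier stay readable.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset with an enabled flag."""

    def __init__(self, name):
        self.name = name
        self.enabled = False  # datasets start disabled until switched on
        self._stored = []

    def ingest(self, record) -> bool:
        # Ingestion only happens while the dataset is enabled.
        if not self.enabled:
            return False
        self._stored.append(record)
        return True

    def query(self):
        # Stored data remains readable regardless of the enabled flag.
        return list(self._stored)


history = ToyDataset("alerts.history")
history.enabled = True
history.ingest({"alert_name": "cpu_high"})
history.enabled = False
dropped = not history.ingest({"alert_name": "disk_full"})
```

After the dataset is disabled, the second record is not stored, but `history.query()` still returns the first one.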
Managing datasets
Manage your datasets from the UI by navigating to Data Flow, then Dataset Management. Here, you can view all active datasets, enable/disable system datasets, apply configuration rules, view schema definitions, and inspect sample documents.
Query syntax
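Query any dataset with DataPrime using the source command. Datasets are referenced in the dataspace/dataset form used elsewhere on this page, such as default/high_errors or system/engine.queries.
If no dataspace is provided, the default dataspace is assumed.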
If you're only using the default dataspace, your existing queries will continue to work.
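The fallback rule can be expressed as a short Python helper. This is a sketch of the resolution logic only, under the assumption that a reference is either a bare dataset name or a `dataspace/dataset` pair; `parse_source` is a hypothetical name and not a DataPrime function.

```python
def parse_source(reference: str, default_dataspace: str = "default"):
    """Split a dataset reference into (dataspace, dataset)."""
    dataspace, separator, dataset = reference.partition("/")
    if not separator:
        # No dataspace given: fall back to the default dataspace.
        return default_dataspace, reference
    # Dots in the dataset name are kept as-is; they carry no hierarchy.
    return dataspace, dataset


bare = parse_source("logs")
qualified = parse_source("system/engine.queries")
```

Under this rule a bare `logs` and an explicit `default/logs` resolve to the same pair, which is why queries written against the default dataspace keep working unchanged.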
System datasets
Coralogix includes several read-only, auto-generated datasets in the system dataspace:
| Dataset | Description |
|---|---|
| system/aaa.audit_events | Stores audit logs for compliance and access monitoring. |
| system/alerts.history | Records alert evaluation and trigger metadata. |
| system/engine.queries | Historical record of user queries for introspection and optimization. |
| system/engine.schema_fields | Tracks field-level schema evolution over time. |
| system/labs.limit_violations | Records each time a configured limit is exceeded. |
| system/notification.deliveries | Logs Notification Center delivery events. Alerts record delivery failures, while Cases record both successful and failed deliveries. |
| system/notification.requests | Captures each incoming notification request metadata. |
See System dataspace for more information.
