Dataspaces and datasets

Dataspaces and datasets provide a two-tiered model for organizing, routing, and securing observability data in Coralogix.

  • Dataspaces define organizational boundaries — such as environments, business units, teams, or regions.
  • Datasets define logical groupings of content — such as logs, traces, metrics, or enrichment data.

Ready to get started?

Create a user-defined dataset, view and manage system datasets, or query existing datasets in Explore.


Dataspaces

A dataspace is a logical container for one or more datasets. It acts as a control layer for:

  • Routing logic
  • Storage structure
  • Retention policies
  • Access control
  • Schema enforcement

Think of dataspaces like databases. Each dataspace groups datasets under a single namespace and enforces shared configuration. For example, your organization might define dataspaces for frontend, backend, and security. The frontend and backend dataspaces could look like this:

frontend/
  ├── ui.events
  └── user.interactions

backend/
  ├── service.requests
  └── system.traces

Configuration inheritance

When a dataset is created inside a dataspace, it automatically inherits the dataspace's configuration:

  • S3 storage paths
  • Retention rules
  • Access policies
  • Metadata enrichment

For example, if a dataspace defines the S3 path s3://my-bucket/my_prefix, new datasets inside that dataspace automatically write to:

s3://my-bucket/my_prefix/dataset1
s3://my-bucket/my_prefix/dataset2

This inheritance is dynamic — no manual setup is needed when new datasets appear.
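The path derivation above can be sketched in a few lines of Python. This is an illustrative model only, not Coralogix code: the function name dataset_storage_path is hypothetical, and the sketch assumes the dataset name is simply appended to the dataspace path, as the example paths suggest.

```python
def dataset_storage_path(dataspace_path: str, dataset: str) -> str:
    """Append a dataset name to its dataspace's S3 path (hypothetical helper)."""
    # Strip any trailing slash so the result never contains "//" after the prefix.
    return f"{dataspace_path.rstrip('/')}/{dataset}"


# Assumed input: the dataspace path from the example above.
DATASPACE_PATH = "s3://my-bucket/my_prefix"
paths = [dataset_storage_path(DATASPACE_PATH, name) for name in ("dataset1", "dataset2")]

for path in paths:
    print(path)
```

Because the path is computed from the dataspace setting each time, a new dataset needs no storage configuration of its own in this model.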

Types of dataspaces

Type | Description
default | The main user-facing dataspace. Contains datasets like logs, spans, and any user-defined datasets.
system | A Coralogix-managed dataspace for internal datasets such as alert history, audit events, and schema metadata.
user-defined | Custom dataspaces created by users to segment data by team, region, environment, or use case. Coming soon.

Datasets

A dataset is a scoped collection of related data within a dataspace. Think of datasets like tables in a database. Each dataset holds a specific stream of observability data (e.g., logs, traces, alerts) and inherits its configuration from its parent dataspace.

Datasets are created:

  • Automatically — via routing logic
  • Manually — through the UI or writeTo queries
  • Dynamically — based on values like $d.region, $l.applicationname, etc.
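As a rough mental model of dynamic creation, the Python sketch below groups events into datasets named after a label value, so a dataset comes into existence the first time its value is seen. Everything here is an assumption made for illustration: the route function, the event shape, and the "unlabeled" fallback are hypothetical and do not describe how Coralogix routing is implemented.

```python
from collections import defaultdict


def route(events, label_key):
    """Group events into datasets named after one label value (hypothetical model)."""
    datasets = defaultdict(list)
    for event in events:
        # Assumed fallback name for events that lack the label.
        name = event["labels"].get(label_key, "unlabeled")
        datasets[name].append(event)
    return dict(datasets)


events = [
    {"labels": {"applicationname": "checkout"}, "msg": "order placed"},
    {"labels": {"applicationname": "search"}, "msg": "query ran"},
    {"labels": {"applicationname": "checkout"}, "msg": "payment failed"},
]

datasets = route(events, "applicationname")
print(sorted(datasets))
```

No dataset is declared up front in this model: each distinct label value produces its own entry.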

Note

Datasets currently work only with archived data.

Because datasets are just identifiers, they can take any name, including dot notation like engine.queries. This does not imply a hierarchy — engine.queries and engine.schema_fields are separate, unrelated datasets.

Key capabilities

Capability | Description
Dynamic creation | Datasets are created on-the-fly based on routing rules or labels like $l.applicationname. No manual setup required.
Scoped performance | Segmented datasets reduce schema collisions and improve query speed by narrowing the search space.
Granular control | Apply retention, access, routing, and enrichment policies at the dataset level.
Reusability | Write query results into datasets and retrieve them later for dashboards, joins, or long-term analytics.

Writing to and reading from a dataset

// Write query results to a dataset
source logs
| filter status_code >= 500
| writeTo default/high_errors

// Reuse it later
source default/high_errors
| groupby path agg count()

Note

Duplicated data created by queries will count towards your quota.

Dataset schemas

Each dataset has an associated schema, influenced by its pillar (logs, spans, etc.) and entity type (e.g., alerts, browserLogs, cpuProfiles).

Pillar | Entity type | Example schema
logs | alerts | { alert_name, severity, status, triggered_at }
logs | browserLogs | { user_agent, page_url, timestamp }
logs | text | { text: "..." }
spans | spans | OpenTelemetry-formatted span objects
metrics | metrics | { __name__, value, labels... }
binary | sessionRecordings | Metadata + link to binary
binary | files | File metadata (e.g., name, size, uploaded_by)

Enabling and disabling datasets

Datasets, especially system datasets, must be manually enabled. Once enabled:

  • All users can query them
  • They count toward your daily quota
  • Previously generated data remains accessible, even if the dataset is later disabled

Disabling a dataset stops its ingestion — not its storage.
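The enable and disable behavior described above can be pictured with a small Python model. This is a sketch under stated assumptions rather than Coralogix's implementation: the Dataset class and its ingest and read methods are hypothetical names, and the model assumes a disabled dataset drops new documents while stored ones stay readable.

```python
class Dataset:
    """Toy model of a dataset that can be enabled or disabled (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.enabled = False  # datasets start disabled until manually enabled
        self._docs = []

    def ingest(self, doc):
        """Store a document only while the dataset is enabled."""
        if not self.enabled:
            return False  # ingestion is stopped, nothing is written
        self._docs.append(doc)
        return True

    def read(self):
        """Return stored documents regardless of the enabled flag."""
        return list(self._docs)


ds = Dataset("system/engine.queries")
ds.enabled = True
accepted = ds.ingest({"query": "source logs"})

ds.enabled = False
rejected = ds.ingest({"query": "source spans"})
stored = ds.read()
print(accepted, rejected, len(stored))
```

In this model, disabling affects only the write path: the document ingested while the dataset was enabled is still returned by read.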

Managing datasets

Manage your datasets from the UI by navigating to Data Flow, then Dataset Management. Here, you can view all active datasets, enable/disable system datasets, apply configuration rules, view schema definitions, and inspect sample documents.


Query syntax

Query any dataset with DataPrime using the source command:

source <dataspace>/<dataset>

Examples:

source default/logs
source system/engine.queries
source frontend/spans

If no dataspace is provided, the default dataspace is assumed:

source logs  // equivalent to source default/logs

If you're only using the default dataspace, your existing queries will continue to work.
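The defaulting rule can be illustrated with a short Python sketch that splits a source reference into its dataspace and dataset parts. It is a simplified model, not the DataPrime parser: SourceRef and parse_source are hypothetical names, and the sketch assumes the first "/" separates the dataspace from the dataset.

```python
from typing import NamedTuple


class SourceRef(NamedTuple):
    dataspace: str
    dataset: str


def parse_source(ref: str, default_dataspace: str = "default") -> SourceRef:
    """Split '<dataspace>/<dataset>', assuming the default dataspace if none is given."""
    dataspace, sep, dataset = ref.partition("/")
    if not sep:
        # No "/" present: the whole reference is the dataset name.
        return SourceRef(default_dataspace, ref)
    return SourceRef(dataspace, dataset)


print(parse_source("logs"))
print(parse_source("system/engine.queries"))
```

Under this model, parse_source("logs") and parse_source("default/logs") resolve to the same pair, which is why queries written without a dataspace keep working.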


System datasets

Coralogix includes several read-only, auto-generated datasets in the system dataspace:

Dataset | Description
system/aaa.audit_events | Stores audit logs for compliance and access monitoring.
system/alerts.history | Records alert evaluation and trigger metadata.
system/engine.queries | Historical record of user queries for introspection and optimization.
system/engine.schema_fields | Tracks field-level schema evolution over time.
system/labs.limit_violations | Records each time a configured limit is exceeded.
system/notification.deliveries | Logs Notification Center delivery events. Alerts record delivery failures, while Cases record both successful and failed deliveries.
system/notification.requests | Captures metadata for each incoming notification request.

See System dataspace for more information.