Processing and routing

When data enters Coralogix, it goes through a structured lifecycle: received from shippers or agents, transformed with DataPrime rules, routed based on attributes like region or team, and directed into the appropriate dataspace and dataset. If a dataset doesn't already exist, it's created automatically and inherits configuration from the parent dataspace.

High-level flow

flowchart LR
    A["Ingress"] --> B["Pre-processing"]
    B --> C["Routing"]
    C --> D["Dataset creation"]
    D --> E["Configuration inheritance"]
    E --> F["Final storage & query"]
  1. Ingress

    Data enters the platform through a shipper or agent. Customers can pre-define a targetDatabase (dataspace) and targetDataset via the shipper config.

  2. Pre-processing

    Coralogix applies DataPrime transformation rules. For example, fields can be removed or recalculated before the data structure is finalized:

    remove derived_metric
    | replace raw_value with normalized_value
    | create derived_metric from quantity * 232
    
  3. Routing

    A set of conditions (e.g., region, team, environment) determines where data goes. For example, the region or team recorded in the data can select the target dataspace and dataset:

    <region == 'us2'>       ->      [targetDataspace = bu1, targetDataset = logs-us]
    <team == 'neptune'>     ->      [targetDataspace = planet, targetDataset = gassy]
    <team == 'venus'>       ->      [targetDataspace = planet, targetDataset = rocky]
    

    Routing is fully data-driven and can include dynamic elements:

    <region>                 ->     [targetDataspace = bu2, targetDataset = logs-{$l.applicationname}]
    
  4. Dataset creation

    If a dataset does not already exist, it will be created automatically under the target dataspace.

  5. Configuration inheritance

    The dataset inherits configuration from its dataspace, including:

    • Storage prefix (e.g., s3://bucket/my-dataspace/logs-regionX)
    • Retention and archive rules
    • Access control policies
    • Metadata enrichment
  6. Final storage & query

    Once routed and processed, the data is written to object storage and made available for querying.
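The pre-processing, routing, dataset-creation, and inheritance steps above can be modeled in a few lines of Python. This is a simplified illustration of the flow, not Coralogix's implementation; rule values are taken from the examples above, and the configuration field names are hypothetical.

```python
def preprocess(event):
    """Step 2: model of the DataPrime rules above (remove, replace, create)."""
    event = dict(event)
    event.pop("derived_metric", None)                   # remove derived_metric
    event["raw_value"] = event["normalized_value"]      # replace raw_value with normalized_value
    event["derived_metric"] = event["quantity"] * 232   # create derived_metric from quantity * 232
    return event

def route(event):
    """Step 3: return (dataspace, dataset), mirroring the example rules."""
    if event.get("region") == "us2":
        return "bu1", "logs-us"
    if event.get("team") == "neptune":
        return "planet", "gassy"
    if event.get("team") == "venus":
        return "planet", "rocky"
    if "region" in event:  # dynamic element: dataset name derived from the event
        return "bu2", f"logs-{event['applicationname']}"
    return "default", "logs"

class Dataspace:
    """Steps 4-5: datasets are created on first write and inherit defaults."""
    def __init__(self, name, defaults):
        self.name, self.defaults, self.datasets = name, defaults, {}

    def dataset(self, name):
        if name not in self.datasets:                   # step 4: auto-creation
            self.datasets[name] = dict(self.defaults)   # step 5: inherit config
        return self.datasets[name]

# Hypothetical dataspace defaults; field names are illustrative.
bu2 = Dataspace("bu2", {"retention_days": 90, "storage_prefix": "s3://bucket/bu2"})
event = {"normalized_value": 4.0, "quantity": 2, "region": "eu1",
         "applicationname": "checkout"}
space_name, ds_name = route(preprocess(event))
config = bu2.dataset(ds_name)
print(space_name, ds_name, config["retention_days"])  # bu2 logs-checkout 90
```

Note that the fallback branch is an assumption for the sketch; in practice, unmatched data lands wherever your default routing sends it.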


Example dataset structure

default/
  ├── logs
  └── spans

business-unit1/
  └── logs

business-unit2/
  ├── logs-cx510
  ├── logs-euprod2
  ├── logs-production
  ├── ...
  └── <datasets created dynamically as data arrives>

security/
  └── ...

Handling quota and duplication

  • Duplicating data across datasets (e.g., routing the same event to multiple targets) will count against your quota.
  • You can monitor dataset-level usage in Dataset Management.
  • Dataset quotas can be enforced per team, space, or workload.
  • The data usage page shows detailed breakdowns to help you understand where and how your data is being consumed.
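Since a duplicated event is charged once per target dataset, a rough accounting model looks like this (an illustrative sketch only, not an actual billing formula):

```python
def quota_bytes(events):
    """Sum quota usage: each event counts once per dataset it is routed to.

    events: iterable of (size_bytes, n_target_datasets) pairs.
    Illustrative model only; actual usage is reported in Dataset Management.
    """
    return sum(size * n_targets for size, n_targets in events)

# One 1 KB event routed to a single dataset, plus one 1 KB event
# duplicated to two datasets, consumes 3 KB of quota, not 2 KB:
print(quota_bytes([(1024, 1), (1024, 2)]))  # 3072
```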