multigroupby - Perform aggregations on expressions

Description

multigroupbyconcatenates the results from two or more queries incorporating groupby into a single dataset.

Use it for:

Efficiency: Data is scanned only once for multiple queries.
Synchronization: Results remain coherent, avoiding discrepancies that can arise when running separate queries.

Syntax

multigroupby  (<grouping_expression_1> as <alias> [, <grouping_expression_2> as <alias_2>, ...])  [, (<grouping_expression_1> as <alias> [, <grouping_expression_2> as <alias_2>, ...]), ...][aggregate]  <aggregation_expression> [as <result_keypath>] [, <aggregation_expression_2> [as <result_keypath_2], ...]

By using the same alias "app" for both groupings, the same semantic meaning is presented as a unified field. In the next example, different aliases are used and the data is combined, but not merged.

Note

calculate & calc have now been deprecated. Users should default to aggregate or agg.

Usage

Example - Multigroupby with the same alias

Let's say we're interesed in grouping our logs as follows:

First by applicationname (app) and then further by subsystemname (ss), providing detailed counts for each combination.
Then independently by applicationname, giving total counts of logs for each application regardless of subsystems.

source logs 
| multigroupby ($l.applicationname as app, $l.subsystemname as ss),($l.applicationname as app) agg count() | orderby app,ss

This will result in the following documents:

[
    {
        "_count0": 241,
        "app": "monitoring24",
        "ss": "NO_SUBSYSTEM_NAME"
    },
    {
        "_count0": 231,
        "app": "monitoring24",
        "ss": "coralogix-opentelemetry-agent"
    },
    {
        "_count0": 15,
        "app": "monitoring24",
        "ss": "coralogix-opentelemetry-collector"
    },
    {
        "_count0": 487,
        "app": "monitoring24",
        "ss": null
    }
]

The first three rows represent counts for each unique combination of app and ss. For example, there are 241 logs where the application (app) is monitoring24 and the subsystem (ss) is NO_SUBSYSTEM_NAME. Similarly, there are 231 logs for the same application but with the subsystem coralogix-opentelemetry-agent, and so on.

The final row provides a total count of logs for the application monitoring24, aggregating all subsystems. Here, _count0 is 487, the sum of all the detailed counts above. The ss field is null to indicate this is the total for the application as a whole.

Note

By using the same alias "app" for both groupings, the same semantic meaning is presented as a unified field. In the next example, different aliases are used and the data is combined, but not merged.

Example - Multigroupby with different aliases

Now, let’s examine the effects of introducing two different aliases for the queries. In this case, the first grouping is labeled as app1 for applicationname combined with ss, while the second grouping is labeled as app2 for applicationname alone.

source logs | multigroupby ($l.applicationname as app1, $l.subsystemname as ss),($l.applicationname as app2) agg count()

This will result in the following documents:

[
    {
        "_count0": 241,
        "app1": "monitoring24",
        "app2": null,
        "ss": "coralogix-opentelemetry-agent"
    },
    {
        "_count0": 231,
        "app1": "monitoring24",
        "app2": null,
        "ss": "coralogix-opentelemetry-collector"
    },
    {
        "_count0": 15,
        "app1": "monitoring24",
        "app2": null,
        "ss": "no_subsystem_name"
    },
    {
        "_count0": 487,
        "app1": null,
        "app2": "monitoring24",
        "ss": null
    }
]

By introducing separate aliases (app1 and app2), the query maintains the distinction between the two groupings rather than merging the data. Rows where app1 is populated and app2 is null correspond to the detailed grouping by applicationname and subsystemname. For instance, 241 logs are associated with app1 = "monitoring24" and ss = "coralogix-opentelemetry-agent". This follows the first grouping logic.

The row where app2 is populated and app1 is null reflects the total count for the second grouping, where logs are aggregated solely by applicationname. For app2 = "monitoring24", the count is 487, and both app1 and ss are null to indicate this higher-level aggregation.

By using separate aliases (app1 and app2), the query does not merge the data, but rather makes it clear which group each result belongs to. The overall logic remains the same: detailed counts for specific combinations and aggregate counts for the total.

Limitations

Currently multigroupby doesn’t return duplicate rows for duplicate group sets.

If you run multigroupby with app and ss, the expected result (if duplicates were allowed) might look like this:

[
  {"app": "monitoring24", "ss": "coralogix-opentelemetry-agent", "_count0": 2},
  {"app": "monitoring24", "ss": "coralogix-opentelemetry-collector", "_count0": 2}
]

Due to the limitation, multigroupby will merge these duplicates and return only one row for each unique combination, even if that combination occurs multiple times in the data:

[
  {"app": "monitoring24", "ss": "coralogix-opentelemetry-agent", "_count0": 2}
]