multigroupby - Perform aggregations on expressions
Description
multigroupby
concatenates the results from two or more queries incorporating groupby
into a single dataset.
Use it for:
-
Efficiency: Data is scanned only once for multiple queries.
-
Synchronization: Results remain coherent, avoiding discrepancies that can arise when running separate queries.
Syntax
multigroupby (<grouping_expression_1> as <alias> [, <grouping_expression_2> as <alias_2>, ...]) [, (<grouping_expression_1> as <alias> [, <grouping_expression_2> as <alias_2>, ...]), ...][aggregate] <aggregation_expression> [as <result_keypath>] [, <aggregation_expression_2> [as <result_keypath_2], ...]
By using the same alias "app" for both groupings, the same semantic meaning is presented as a unified field. In the next example, different aliases are used and the data is combined, but not merged.
Note
calculate
& calc
have now been deprecated. Users should default to aggregate
or agg
.
Usage
Example - Multigroupby with the same alias
Let's say we're interesed in grouping our logs as follows:
-
First by applicationname (app) and then further by subsystemname (ss), providing detailed counts for each combination.
-
Then independently by applicationname, giving total counts of logs for each application regardless of subsystems.
source logs
| multigroupby ($l.applicationname as app, $l.subsystemname as ss),($l.applicationname as app) agg count() | orderby app,ss
[
{
"_count0": 241,
"app": "monitoring24",
"ss": "NO_SUBSYSTEM_NAME"
},
{
"_count0": 231,
"app": "monitoring24",
"ss": "coralogix-opentelemetry-agent"
},
{
"_count0": 15,
"app": "monitoring24",
"ss": "coralogix-opentelemetry-collector"
},
{
"_count0": 487,
"app": "monitoring24",
"ss": null
}
]
The first three rows represent counts for each unique combination of app and ss. For example, there are 241 logs where the application (app) is monitoring24 and the subsystem (ss) is NO_SUBSYSTEM_NAME. Similarly, there are 231 logs for the same application but with the subsystem coralogix-opentelemetry-agent, and so on.
The final row provides a total count of logs for the application monitoring24, aggregating all subsystems. Here, _count0 is 487, the sum of all the detailed counts above. The ss field is null to indicate this is the total for the application as a whole.
Note
By using the same alias "app" for both groupings, the same semantic meaning is presented as a unified field. In the next example, different aliases are used and the data is combined, but not merged.
Example - Multigroupby with different aliases
Now, let’s examine the effects of introducing two different aliases for the queries. In this case, the first grouping is labeled as app1 for applicationname combined with ss, while the second grouping is labeled as app2 for applicationname alone.
source logs | multigroupby ($l.applicationname as app1, $l.subsystemname as ss),($l.applicationname as app2) agg count()
This will result in the following documents:
[
{
"_count0": 241,
"app1": "monitoring24",
"app2": null,
"ss": "coralogix-opentelemetry-agent"
},
{
"_count0": 231,
"app1": "monitoring24",
"app2": null,
"ss": "coralogix-opentelemetry-collector"
},
{
"_count0": 15,
"app1": "monitoring24",
"app2": null,
"ss": "no_subsystem_name"
},
{
"_count0": 487,
"app1": null,
"app2": "monitoring24",
"ss": null
}
]
By introducing separate aliases (app1 and app2), the query maintains the distinction between the two groupings rather than merging the data. Rows where app1 is populated and app2 is null correspond to the detailed grouping by applicationname and subsystemname. For instance, 241 logs are associated with app1 = "monitoring24" and ss = "coralogix-opentelemetry-agent". This follows the first grouping logic.
The row where app2 is populated and app1 is null reflects the total count for the second grouping, where logs are aggregated solely by applicationname. For app2 = "monitoring24", the count is 487, and both app1 and ss are null to indicate this higher-level aggregation.
By using separate aliases (app1 and app2), the query does not merge the data, but rather makes it clear which group each result belongs to. The overall logic remains the same: detailed counts for specific combinations and aggregate counts for the total.
Limitations
Currently multigroupby
doesn’t return duplicate rows for duplicate group sets.
If you run multigroupby
with app and ss, the expected result (if duplicates were allowed) might look like this:
[
{"app": "monitoring24", "ss": "coralogix-opentelemetry-agent", "_count0": 2},
{"app": "monitoring24", "ss": "coralogix-opentelemetry-collector", "_count0": 2}
]
Due to the limitation, multigroupby
will merge these duplicates and return only one row for each unique combination, even if that combination occurs multiple times in the data: