## TL;DR

Group `system/engine.schema_fields` by dataset, count distinct types with `distinct_count(type)`, and rank datasets by `type_count` to reveal which have the most diverse schemas.

## Problem / Use case

You want to understand how many unique data types exist across your datasets and see which ones contain the greatest type variety. This helps identify inconsistent schemas or areas where normalization is needed.

## Query

```dataprime
source system/engine.schema_fields
| groupby dataset
    aggregate
        distinct_count(type) as type_count,
        any_value(type) as sample_type
| orderby type_count desc
```

## Expected output

| dataset              | type_count | sample_type |
| -------------------- | ---------- | ----------- |
| logs                 | 13         | object      |
| aaa.audit_events     | 10         | string      |
| engine.schema_fields | 8          | null        |
| engine.queries       | 7          | number      |
| spans                | 6          | string      |
| labs.limitViolations | 5          | number      |

## Variations

- Count schema fields by **source system** instead of dataset.
- Include **field count per type** to measure schema complexity within datasets.
