approx_count_distinct - Count unique values approximately for performance
The approx_count_distinct
function will count a given document if:
- The document contains a unique, non-null value for a specified field.
This function is an approximation of the result from distinct_count
and is designed for high performance scenarios where the exact figure isn't important.
NOTE: approx_count_distinct
is an aggregation function, so should be used in conjunction with a grouping keyword, like groupby
.
Syntax
Arguments
Name | Type | Required | Description |
---|---|---|---|
expression | any | true | The non-null value to be counted distinctly |
Example - Checking for high users
In this example, we wish to understand how many active users there have been in a given timeframe. We don't need the exact value, since we're primarily interested if it's over a certain figure. We can provide the username
field to produce a count of the unique users.
We can then flag whether this is higher than our threshold of users:
The field is_high_user_count
is now true
is the approximate total number of users for a given applicationname
is over 10000, and false
otherwise.