Create Time Window SLOs
Note
SLOs and SLO alerts are in Beta mode. Features may change, and some functionality may be limited.
Time window SLOs evaluate performance over fixed intervals. This SLO type is best suited for use cases that prioritize reliability across consistent, minute-by-minute time windows.
This method ensures that quality remains steady throughout the entire SLO period by evaluating performance in each discrete time window. Any window that fails to meet the defined criteria counts as a failure and consumes error budget.
Overview
Time window SLOs measure reliability by assessing the performance of a service or other object in fixed windows of 1 or 5 minutes. Instead of accumulating requests over the SLO time frame (e.g., 7, 14, 21, or 28 days), the requests in each time window are evaluated as good or bad based on whether they meet a predefined success threshold (e.g., 99% of requests within a 1-minute window are successful).
The Service Level Indicator (SLI) is calculated as:

SLI = (good time windows ÷ total time windows) × 100%
Time window SLOs are especially effective for latency because they evaluate performance over fixed intervals of 1 or 5 minutes, marking each window as good or bad based on whether it meets a defined threshold. This approach avoids the need to aggregate individual request data across long time frames and eliminates reliance on histogram buckets such as `le`, which require manual setup, introduce high cardinality, and limit flexibility. By using raw metrics directly, users can define latency thresholds more naturally, resulting in simpler configuration, more accurate evaluations, and lower storage and compute costs. It also better captures real-world behavior by reflecting the impact of sustained latency spikes rather than isolated slow requests.
SLO components
When defining a time window SLO, you must specify:
- Conditions of the time windows: a measurement interval (1 or 5 minutes), the metric to query within each window, an operator (e.g., `<`, `>=`, `!=`), and a success threshold (e.g., p95 latency ≤ 1.5s)
- Time frame (e.g., 7 days) and a target percentage for the overall SLO (e.g., 95%)
SLO setup
To create a new Service Level Objective (SLO), go to APM > SLO Center and click Create SLO. Define the SLO details.
| Field | Description |
|---|---|
| Name | The unique name identifying your SLO. |
| Owner | The user responsible for maintaining and reviewing the SLO. Ownership defaults to the creator, but can be reassigned to any Coralogix team member. |
| Entity labels | Metadata used for filtering and grouping SLOs. |
| Description | (Optional) Additional context or purpose of the SLO to clarify its scope and intent for other users. |
Select the SLO type
Choose Time window as the SLO type. This model evaluates how many time windows (e.g., 1 or 5 minutes) meet a defined success condition within the SLO time frame.
Define the SLI
To define your uptime condition, specify:
- A time window (1m or 5m)
- A PromQL query that returns a value to evaluate (e.g., p95 latency, error rate)
- An operator (e.g., `<`, `>=`, `!=`)
- A threshold the value must satisfy
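For instance, a per-window error-rate SLI could be expressed with a query like the following. The metric and label names (`http_requests_total`, `service_name`, `status`) are illustrative, not prescribed by the product:

```promql
# Fraction of requests returning 5xx status codes (illustrative metric names)
sum(rate(http_requests_total{service_name="my-service", status=~"5.."}[5m]))
/
sum(rate(http_requests_total{service_name="my-service"}[5m]))
```

Paired with the operator `<` and a threshold of `0.01`, a window would count as good when fewer than 1% of its requests fail.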
Note
The metrics used in the query can either be sent directly to Coralogix or derived from logs and spans using Event2Metrics.
Set the operator and threshold
Use the dropdowns to define your operator (e.g., `<`, `>=`) and threshold value.
In this example:
- Operator: `>=`
- Threshold: `80`
This means a window is considered good if average CPU usage meets or exceeds 80. Otherwise, it’s bad.
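A query backing this example might look like the following sketch, where the metric name `cpu_usage_percent` is an assumption for illustration:

```promql
# Average CPU usage across all instances of the service (illustrative metric name)
avg(cpu_usage_percent{service_name="my-service"})
```

With the operator `>=` and threshold `80`, each window in which this average is at least 80 counts as good.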
Set the SLO target and time frame
Define:
- Time frame (e.g., last 7 days)
- SLO target (e.g., 95% of time windows must be good)
In the example shown, the system tracks whether at least 95% of all 1-minute windows over the past 7 days were within the threshold.
Real-time preview
You’ll see:
- State – Current percentage of good windows (e.g., 98%)
- Status – Whether the target is met (e.g., OK)
- Budget – Remaining percentage of allowed downtime (e.g., 82% error budget remains)
- Visualization – A chart with the measured metric over the recent SLO time frame
This preview allows you to assess how the defined threshold performs against real metric data over the SLO time frame.
Example: Time window SLO with p95 latency threshold
In the example below, an SLO is configured with 5-minute time windows. The uptime condition is defined as: p95 latency must be less than or equal to 1.5 seconds. Only one time window in the evaluation period exceeds this threshold.
The total monitored period spans 12 hours (720 minutes), resulting in 144 time windows (720 ÷ 5). With one window violating the latency condition, the service is considered up for 715 minutes and down for 5 minutes.
The resulting uptime is 715 ÷ 720 ≈ 99.31%.
Underlying PromQL query
This query calculates the p95 latency over 5-minute windows using Prometheus histograms. Each time window is evaluated against the threshold: if the p95 value is greater than 1.5 seconds, that time window is marked as downtime.
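A typical form of this query, assuming a Prometheus histogram metric named `duration_seconds_bucket` (an illustrative name), is:

```promql
# p95 latency per 5-minute window, computed from histogram buckets (illustrative metric name)
histogram_quantile(
  0.95,
  sum by (le) (rate(duration_seconds_bucket{service_name="my-service"}[5m]))
)
```

Paired with the operator `<=` and a threshold of `1.5`, any window where the resulting p95 exceeds 1.5 seconds is marked as downtime.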
Additional query examples
You can define time window SLOs using any PromQL expression that returns a single numeric value per time window. Here are two additional examples based on average and total latency.
Average latency per time window
This query calculates the average latency by dividing the sum of durations by the count of requests. You might use this with a threshold such as `< 300ms`.
```promql
# Range selector matches the chosen time window (1m or 5m)
sum(increase(duration_ms_sum{service_name="my-service"}[5m]))
/
sum(increase(duration_ms_count{service_name="my-service"}[5m]))
```
- Time window: 1m or 5m
- Threshold: `< 300` (milliseconds)
- Use case: Tracks the average response time of a specific service in each time window.
Total latency grouped by dimension
This query evaluates total latency in each time window and groups results by one or more labels (e.g., `route`, `instance`, `service_name`).
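A query of this shape, reusing the illustrative `duration_ms_sum` metric from the previous example, might be:

```promql
# Total latency accumulated per route in each window (illustrative metric and label names)
sum by (route) (increase(duration_ms_sum{service_name="my-service"}[5m]))
```

Each labeled series (here, each `route`) is evaluated against the threshold independently within every time window.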
- Time window: 1m or 5m
- Threshold: `< 1m` (milliseconds or seconds, depending on metric scale)
- Use case: Ensures total latency remains within bounds across grouped entities such as routes or services.
Next steps
Once the configuration is complete:
- Click Save to store the SLO.
- Click Save & create alert to immediately configure an alert based on this SLO.
Additional resources
Find out how to safely use recording rule–based metrics in your SLO creation with this guide.