Coralogix is not just another monitoring or observability platform. We’re using our unique Streama technology to analyze data without needing to index it so teams can…
Alerting has been a fundamental part of operations strategy for the past decade. An entire industry is built around delivering valuable, actionable alerts to engineers and customers as quickly as possible. We will explore what’s missing from your alerts and how Coralogix Flow Alerts solve a fundamental problem in the observability industry.
When engineers build their alerts, they focus on making them as useful as possible, but how do we define useful? While this is a complicated question, we can break the utility of an alert into a few easy points:
For many engineers, achieving these three qualities is a never-ending battle. Engineers are constantly chasing after the smallest, valuable set of alerts we can possibly have to minimize noise and maximize uptime.
However, one key feature is missing from almost every alerting provider, and it goes right to the heart of observability in 2022.
If we host our own solution, perhaps with an ELK stack and Prometheus, as is so common in the industry, we are left with some natural alerting options. Alertmanager integrates nicely with Prometheus, and Kibana comes with its own alerting functionality, so you have everything you need, right? Not quite.
Your observability data has been siloed into two specific datastores: Elasticsearch and Prometheus. As soon as you do this, you introduce an architectural complication.
Despite how simple this sounds, this is something that is not supported by the vast majority of SaaS observability providers or open-source tooling. Metrics, logs, and traces are treated as separate pillars, filtering down into our alerting strategies.
It isn’t clear how this came about, but you only need to look at the troubleshooting practices of any engineer to work out that it’s suboptimal. As soon as a metric alert fires, the engineer looks at the logs to verify. As soon as a log alert fires, the engineer looks at the metrics to better understand. It’s clear that all of this data is used for the same purpose, but we silo it off into separate storage solutions and, in doing so, make our life more difficult.
The answer is twofold. Firstly, we need to bring all of our observability data into a single place, to build a single pane of glass for our system. Aside from alerting, this makes monitoring and general querying more straightforward. It removes the complex learning curve associated with many open-source tools, which speeds up the time it takes for engineers to become familiar with their chosen approach to observability. However, getting data into one place isn’t enough. Your chosen platform needs to support holistic alerting. And there is only one provider on the market – Coralogix.
There are many SaaS observability providers out there that will consume your logs, metrics, and traces, but none of them can tie all of this data together into a single, cohesive alert that completely describes an outage, making use of your logs, metrics, and traces in the same alert.
Flow alerts enable you to view your entire system globally without being constrained to a single data type. This brings some key benefits that directly address the great limitations in alerting: