With all the data available today, it’s easy to think that tracking website performance is as simple as installing a few tools and waiting for alerts to appear. The challenge in site performance monitoring isn’t a lack of data. The real challenge is understanding what data to look at, what data to ignore, and how to interpret the data you have. Here’s how.
There are five common mistakes that administrators make when tracking website performance. Avoiding them does not require purchasing any new tools. You simply need to learn how to work more intelligently with the data visualization tools you already have.
Mistake #1 – Not Considering the End-User Experience
When it comes to things like response times, it’s easy to rely on “objective” speed tests and response logs to determine performance, but that can be a mistake. The speed that often matters more is perceived performance, or how fast an end user thinks the site is.
Back in 1993, usability expert Jakob Nielsen pointed out that end users perceive a 100-millisecond response time as instantaneous and consider a one-second response time acceptable. These are the kinds of measurements you must make to accommodate end users, rather than relying solely on “backend” data.
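As a minimal sketch of how those two figures could be applied, the Python below sorts measured response times into perceived-speed bands. The 0.1-second and 1-second limits come from the paragraph above; the band names and the function names are illustrative assumptions, not part of any monitoring tool.

```python
from typing import Dict, List

# Limits taken from the text above (Nielsen, 1993).
INSTANT_S = 0.1      # at or below this, a response feels instantaneous
ACCEPTABLE_S = 1.0   # at or below this, a response is still acceptable


def classify_response(seconds: float) -> str:
    """Map one measured response time to a perceived-speed band."""
    if seconds <= INSTANT_S:
        return "instantaneous"
    if seconds <= ACCEPTABLE_S:
        return "acceptable"
    return "slow"


def perceived_summary(samples: List[float]) -> Dict[str, float]:
    """Return the share of samples that fall in each band."""
    counts = {"instantaneous": 0, "acceptable": 0, "slow": 0}
    for value in samples:
        counts[classify_response(value)] += 1
    total = len(samples)
    return {band: (n / total if total else 0.0) for band, n in counts.items()}
```

Fed with timings collected in the browser rather than on the server, a summary like this shows what share of visits felt slow, which an average backend response time can hide.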
Mistake #2 – Letting Common Errors Obscure New Errors
Common errors, by definition, make up the bulk of the weblog data. These errors occur so often that it’s easy to ignore them. The problem arises when a new, unique error is hidden among a large number of common errors and gets missed because of all the “noise” around it.
It is important that your website performance monitoring tool has the ability to quickly identify these “needles in a haystack.” Short of that, it is essential that the long trail of common errors be mined for new errors. This can give valuable clues to forthcoming performance problems.
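One way to mine a long trail of common errors for new ones is to reduce each log line to a signature and compare recent signatures against a known baseline. The sketch below assumes plain-text log lines and uses a deliberately crude normalization (numbers and hex values are collapsed); the function names are hypothetical and real log formats will likely need their own rules.

```python
import re
from collections import Counter
from typing import Iterable, List, Set


def signature(line: str) -> str:
    """Collapse variable parts so that similar errors share one signature."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    return re.sub(r"\d+", "<n>", line).strip()


def new_error_signatures(baseline: Iterable[str], recent: Iterable[str]) -> List[str]:
    """Return signatures seen in recent logs but never in the baseline, rarest first."""
    known: Set[str] = {signature(line) for line in baseline}
    counts = Counter(signature(line) for line in recent)
    ordered = sorted(counts.items(), key=lambda item: item[1])
    return [sig for sig, _ in ordered if sig not in known]
```

For example, with a baseline containing only 404 and timeout errors, a single "Disk quota exceeded" line in the recent logs is returned on its own, regardless of how many 404s surround it.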
Mistake #3 – Ignoring Old Logs
It’s easy for IT folks to avoid going back to investigate older logs, say those more than a year old, but that can be a mistake. Being able to go back and evaluate large quantities of older data, using simplifying techniques such as log patterns or other graphical means, can give great insight. It can also help highlight a slow-developing problem that could easily be missed by focusing strictly on the most recent results.
When establishing site performance monitoring policies, make sure they specify enough historical data to give you the results you need. Assuming you use a site management tool that can accommodate large data sets, it’s better to err on the side of too much data than not enough.
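To illustrate how historical data can expose a slow-developing problem, the sketch below fits a least-squares line to a series of periodic averages (for example, monthly mean response times) and returns its slope. The choice of a linear fit and the function name are assumptions made for illustration; other trend measures may suit your data better.

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their position (change per period)."""
    n = len(values)
    if n < 2:
        return 0.0  # not enough history to estimate a trend
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance
```

A small positive slope over eighteen months of averages can point to gradual degradation that no single week of data would reveal, which is one reason to keep enough history.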
Mistake #4 – Setting the Wrong Thresholds
Website performance tracking is only as good as the alert thresholds you set. Set them too high and you miss a critical event. Set them too low and you get a flood of false positives, which eventually leads people to ignore alerts altogether.
The reality is that some of the servers or services being monitored are more prone to momentary or temporary performance degradation. Such a dip is enough to trigger an alert but not enough to bring the service down, and could be due to traffic load or a routing issue. Whatever the reason, the only way around this is trial and error. Make sure you spend the time to optimize the threshold levels for all your systems to give you the actionable information you need.
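Some of that trial and error can be done offline by replaying historical samples against candidate thresholds. The sketch below counts how many alerts each candidate would have fired; the `sustain` parameter, which requires several consecutive breaches before an alert fires, is an assumption added here as one common way to filter out temporary dips, not something the article prescribes.

```python
from typing import Dict, Sequence


def alerts_fired(samples: Sequence[float], threshold: float, sustain: int = 1) -> int:
    """Count alerts, firing once per run of at least `sustain` consecutive breaches."""
    alerts = 0
    run = 0
    for value in samples:
        if value > threshold:
            run += 1
            if run == sustain:
                alerts += 1  # fire once, then stay quiet until the run ends
        else:
            run = 0
    return alerts


def compare_thresholds(
    samples: Sequence[float], candidates: Sequence[float], sustain: int = 1
) -> Dict[float, int]:
    """Map each candidate threshold to the number of alerts it would have fired."""
    return {t: alerts_fired(samples, t, sustain) for t in candidates}
```

Comparing the counts with the number of incidents that actually needed attention in the same period gives a rough idea of which setting is too noisy and which is too quiet.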
Mistake #5 – Only Paying Attention to Critical Alerts
The sheer volume of site performance data makes it natural to want to ignore it all until a critical alert appears, but that can be a mistake. By the time a critical alert fires, you may already be facing a system outage, which could affect your ability to meet your Service Level Agreements and possibly even lead to liquidated damages.
Rarely do systems fail without giving some clues first. It is imperative that you mine your existing data to develop a predictive ability over time. You also need to understand which site performance metrics are worth paying attention to before it costs you any money.
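As a rough illustration of looking for clues before a critical alert, the sketch below flags periods in which the number of warning-level events rises well above its recent norm. The window length, the multiplier, and the use of a median baseline are all assumptions chosen for the example and would need tuning against your own data.

```python
from statistics import median
from typing import List, Sequence


def rising_warning_periods(
    warning_counts: Sequence[int], history: int = 6, factor: float = 2.0
) -> List[int]:
    """Return indices of periods whose warning count exceeds `factor` times
    the median of the preceding `history` periods."""
    flagged = []
    for i in range(history, len(warning_counts)):
        baseline = median(warning_counts[i - history:i])
        # Use a floor of 1 so a quiet baseline of zero does not flag every warning.
        if warning_counts[i] > factor * max(baseline, 1):
            flagged.append(i)
    return flagged
```

With hourly counts of `[3, 4, 3, 5, 4, 3, 4, 12, 15]`, the last two hours are flagged, which is the kind of early signal worth investigating before it becomes a critical alert.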
There you have it. Five mistakes to avoid when monitoring website performance. Not more data, but more intelligent use of existing data.