
Why You Need to Closely Monitor Your Exchange Servers

  • Peter Hammond
  • April 13, 2021

Monitoring the logs of your on-prem and hybrid cloud infrastructure has always been important. With cyber attacks, zero-day exploits, and insider threats on the rise, keeping track of your infrastructure has taken on renewed significance.

Microsoft Exchange is one of the most prominent enterprise systems in use today, with both cloud and on-prem iterations. In this article, we’re going to examine the significance of the recent Microsoft Exchange Zero-Day vulnerability, what it means for users around the world, and what log monitoring brings to these stressful situations.

What happened with the recent Microsoft breach?

In early March 2021, Microsoft announced that it had detected multiple zero-day exploits being used in highly targeted attacks on Microsoft Exchange on-prem. This is significant for several reasons.

First, any zero-day vulnerability is highly dangerous owing to its lack of coverage by existing security tooling built into the product, application, or service. Users of products or platforms with existing zero-day exploits must then rely on other security measures, such as firewalls or monitoring, to identify and combat the vulnerability.

Second, the wide usage of Microsoft Exchange on-prem makes these vulnerabilities far more damaging than if they affected a lesser-known platform. Enterprises, governments, and major financial institutions all rely on Microsoft Exchange for calendaring, mail, and collaboration.

Third, while the exploits were announced in early March, they may have been used maliciously from as early as January. Volexity reports that it had evidence of one of the exploits being used against two of its customers to exfiltrate large amounts of data to unknown, malicious IP addresses.

Overall, the ongoing impact of these zero-day exploits is still being measured. Microsoft has released a range of patches, but given the nature of the exploits, victims, and time elapsed, it’s difficult to tell what the fallout will be. 

What should you do when you suspect a security breach?

Dealing with a security breach is always going to be highly stressful. Stolen intellectual property, exposed customer data, and degraded performance are some of the many factors that will cost you directly in dollars lost and indirectly in brand reputation and relationships.

Laws in many countries now mandate how and when a company must disclose that it has had a security or data breach. Because a breach is therefore likely to become public, reputational damage must be an additional consideration.

What’s the protocol for dealing with a security breach?

Companies will have differing procedures in place for dealing with a suspected security breach. In 2020, 30 data security experts were surveyed on what they felt companies’ priorities ought to be following a breach.

We’ll examine the most common of these below (in no particular order) and discuss their relevance to the recent Microsoft Exchange breach. 

1. Understand what happened

While this may seem obvious, getting a full understanding of the breach, your company’s exposure, and the potential ramifications is necessary. Not only will this feed into your mitigation and disaster recovery execution, but it will also allow you to plug any gaps. It also includes understanding whether your systems are still compromised and remain at risk. Your logging, monitoring, and observability systems are going to be key here.

Needless to say, this is easier said than done. In the context of the Microsoft Exchange vulnerability, big industry players like Microsoft and other security companies were the first to uncover and publicize this news. However, the impact on your own business is going to be different in every case.

2. Mitigate the problem

Again, another obvious one. However, this is intrinsically linked to the above priority. Mitigation will depend on what you have uncovered, whether you have a fix, and what has been affected.

You can’t mitigate a security issue without having a full understanding of the problem. Mitigation will involve isolating where the breach occurred and changing access management policies, passwords, and encryption keys – anything, in effect, which may have been compromised.

Again, referring to your observability platform is going to be crucial. A single-pane-of-glass view giving you context around which systems were affected, and when, will be critical in understanding whether a malicious actor left a trojan behind.

In the case of the Microsoft Exchange server breaches, patches and fixes for the zero-day exploits were relatively slow to arrive, becoming publicly available roughly two months after the first observed exploitation in January. If you’re able to identify something before your vendor tells you, you can act without needing to see the bad news online.

3. Capture, analyze, and visualize data

Data is key to understanding any security breach. A common theme throughout all the security experts’ recommendations is to capture 100% of data available, even if you think it might not be relevant. 

Having data from application performance, system health, network traffic, and more will enable system administrators, engineers, and investigators to pinpoint issues. This is critical in building an understanding of the breach and its impact, and in informing your strategy moving forward.

With Microsoft Exchange, this presents a few problems which we will discuss later in this article. Naturally, you need a robust monitoring solution set up, a good log repository (ideally a cost-effective solution given the likely volume of data), and you need to know what you’re looking for in all of the data you are collecting.

The Problem with Monitoring Microsoft Exchange Server

As covered in the expert opinions above, data capture and analysis are important to every part of your data breach strategy. For effective data capture, your monitoring solution needs to be robust, properly implemented, and display metrics relating to all of your inter-connected systems. 

We’re talking about observability. However, this is a two-part problem. First, how do you collect, hold, and filter all of the relevant data? Second, how do you understand these logs and metrics quickly for real-time and post-incident response?

Using Windows Event Logs before, during, or after a breach

If you’re using Office 365-hosted Exchange, you will have a limited entitlement to Windows Event Logs. Given that it is Microsoft-provided, you might think that it’s a natural companion for observability on Microsoft Exchange. There are, however, a few aspects of the logging solution which mean it’s not always going to be the most helpful when looking at security issues.

Microsoft will levy hefty charges if you want to mirror your Windows Event Logs. With cost often being a driver for organizational IT strategy, it’s likely that you won’t have sufficient coverage to create a truly observable system.

Windows Event Log Analysis is also fairly complex. When trying to understand the status of your system mid-breach, you may be spending unnecessary time looking through all of the different log types, without getting down to the nuts and bolts of what’s going on.

Failed services are often a sign of malicious activity within Windows Event Logs. But with four subsets of service failure logs, each with significant amounts of data, your sysadmins and engineers will be left scratching their heads trying to work out what has gone wrong. 
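
As a rough illustration of cutting through that volume, the sketch below counts service-failure events per service so the noisiest services surface first. It assumes the events have already been exported as dictionaries with hypothetical `event_id` and `service` fields, and it assumes that the Service Control Manager event IDs 7023, 7024, 7031, and 7034 are the failure types you care about; both the field names and the ID list are illustrative and should be adjusted to your environment.

```python
from collections import Counter

# Assumption: Service Control Manager event IDs treated as "service failure".
# 7023/7024 = service terminated with an error, 7031/7034 = terminated unexpectedly.
SERVICE_FAILURE_IDS = {7023, 7024, 7031, 7034}


def summarize_service_failures(events):
    """Return (service, failure_count) pairs, most frequent first.

    `events` is an iterable of dicts with hypothetical keys
    `event_id` (int) and `service` (str).
    """
    counts = Counter(
        event["service"]
        for event in events
        if event.get("event_id") in SERVICE_FAILURE_IDS
    )
    return counts.most_common()


# Hypothetical sample data, not taken from a real system.
sample_events = [
    {"event_id": 7031, "service": "MSExchangeTransport"},
    {"event_id": 7036, "service": "Spooler"},  # state change, not a failure
    {"event_id": 7034, "service": "MSExchangeTransport"},
    {"event_id": 7023, "service": "W3SVC"},
]

failure_summary = summarize_service_failures(sample_events)
```

A summary like this does not explain why a service failed, but it narrows millions of rows down to a short list of services worth investigating first.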

This issue of complexity also bubbles over into your post-incident response. Whether your company has opted for an external team of forensic investigators, or your own team is carrying out the work, you’ll need simplicity and transparency. Time spent sifting through millions of rows of data is time not spent either fixing the problem or continuing with other duties. 

Using Coralogix before, during, or after a breach

Coralogix has a direct integration with Windows Event Viewer that ingests all of the log information and displays it alongside your other system health and log data, in a single pane of glass. 

Before a breach

Coralogix integrates with numerous security tools, can be integrated with firewalls, and even has its own security traffic analyzer. Overlaying data from these systems with Windows Event Viewer data puts your organization in a strong position to spot any potential weaknesses, either in your code or in your systems, at the earliest possible opportunity.

If that isn’t enough, Coralogix’s platform also uses machine learning to spot anomalies in the data it aggregates. This is significant for two reasons.

First, Coralogix learns what the baseline is for your system, be it a network, application, or firewall. With this baseline, Coralogix can identify any departure from the norm and alert you to it far more effectively than any SOC team.

Second, moving away from rules-based alerts allows you to be more flexible in your approach to potential malicious activity. Machine learning-driven insights will adapt to the ebb and flow of your system, minimizing false positives.
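
To make the idea of a baseline concrete, here is a minimal sketch of one common approach: compare the latest value of a metric against the mean and standard deviation of its recent history (a z-score). This is not Coralogix's algorithm, which the article does not describe; the function name, the threshold of 3, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it sits more than `threshold` standard
    deviations away from the mean of `history`.

    `history` is a list of recent values for one metric, for example
    failed logins per hour. With fewer than two points there is no
    baseline yet, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a departure.
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


# Hypothetical hourly counts of failed logins.
recent_hours = [100, 104, 98, 101, 97, 102]

normal_hour = is_anomalous(recent_hours, 103)   # close to the baseline
suspect_hour = is_anomalous(recent_hours, 180)  # far outside the baseline
```

Unlike a fixed rule such as "alert above 150 failed logins", the threshold here moves with the data, which is the property that helps reduce false positives as normal traffic ebbs and flows.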

During a breach

If you suspect that your infrastructure has been compromised, maybe because Coralogix has alerted you to a potential anomaly, there are several tools in your arsenal to help.

Coralogix offers geo-enrichment of IP data it ingests. Naturally, your firewall should be filtering traffic from IPs it understands to be malicious, but you’re relying on your firewall provider to do so. With geo-enrichment, you can visualize where your IP hits are coming from. If they are different from your usual customer base, you might want to take note.
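
The check described above can be sketched in a few lines. The example assumes that each log record has already been enriched with a hypothetical `country` field holding a two-letter country code; the field name, the function, and the sample records are illustrative and do not reflect Coralogix's actual schema.

```python
from collections import Counter


def unexpected_countries(records, expected, min_hits=1):
    """Return {country: hit_count} for countries outside `expected`.

    `records` is an iterable of dicts carrying a hypothetical
    `country` key added by geo-enrichment. Records without a
    country are ignored.
    """
    hits = Counter(r["country"] for r in records if r.get("country"))
    return {
        country: count
        for country, count in hits.items()
        if country not in expected and count >= min_hits
    }


# Hypothetical enriched records; the IPs come from documentation ranges.
enriched = [
    {"ip": "192.0.2.10", "country": "GB"},
    {"ip": "198.51.100.7", "country": "BR"},
    {"ip": "192.0.2.11", "country": "GB"},
    {"ip": "198.51.100.9", "country": "BR"},
    {"ip": "203.0.113.4"},  # enrichment failed, no country
]

# Assumption: the usual customer base is in the UK and the US.
outliers = unexpected_countries(enriched, expected={"GB", "US"})
```

Traffic from an unfamiliar country is not proof of an attack, so a result like this is best treated as a prompt to look more closely rather than as an alert in its own right.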

With the Alerts API, Coralogix allows you to fully customize what you’re being pinged about, how often, and how it’s dealt with. This allows your security team to remain laser-focused on a potential ongoing issue, while making sure any relevant data is pushed straight to the right people. This aspect of observability can empower organizations to react effectively to security threats.

After a breach 

Should the worst happen, and you identify a breach after the fact, then Coralogix is well-positioned to help.

If you need to recall large amounts of data for analysis natively within Coralogix, then you can do so instantly. To help you understand this data and pinpoint a breach or its impact, Coralogix has many visualization options that allow you to plot data in Kibana, Grafana, or natively within Coralogix.

Post-incident, you’ll be relying on your log data to tell you what has happened. With a SaaS solution like Coralogix, your logs and metrics are isolated from your main system, which means attackers can’t edit log files to hide what they’ve done.

Summary

To wrap up, data breaches are scary things. They are even more intimidating when it’s a zero-day exploit affecting something as pervasive as Microsoft Exchange Server.

As we’ve seen, and as we know, data is key in preventing cyber-attacks and mitigating their impact. Breaches are an unfortunate fact of the modern world, and it’s your duty to ensure that you can react appropriately. A large part of that is making sure that you have the right tools for the job. Trusted by enterprises the world over, Coralogix is a great partner for monitoring and protecting your Microsoft Exchange estate.
