Splunk Indexer Vulnerability: What You Need to Know

A new vulnerability, CVE-2021-3422, has been discovered in the Splunk indexer component, a commonly used part of the Splunk Enterprise suite. We’re going to explain the affected components, the severity of the vulnerability, the mitigations you can put in place, and the long-term considerations you may wish to weigh when using Splunk.

What is the affected component?

The Splunk indexer is responsible for sorting and indexing data that the Splunk forwarder sends to it. It is a central place where much of your observability data will flow as part of your Splunk setup. The forwarder and the indexer communicate with one another using the Splunk-to-Splunk (S2S) protocol.

The vulnerability itself lies in how the indexer validates data received over the S2S protocol. The S2S protocol allows for a field type called field_enum_dynamic. This field lets you send a numerical value in your payload and have it automatically mapped to a corresponding text value. This is useful because your machines can talk in status codes, but those codes can be dynamically mapped to human-readable text.

What is the impact of the vulnerability?

This field type, field_enum_dynamic, is not validated properly, which means that a specially crafted value can enable a malicious attacker to read memory that they shouldn’t be able to access. This is called an Out of Bounds (OOB) read vulnerability, and essentially means an attacker can read outside their intended boundaries.
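
To make the failure mode concrete, here is a deliberately simplified sketch in Python of how an enum-style lookup that maps a numeric code to text becomes dangerous when the index is not validated. This is an illustration of the general class of bug, not Splunk’s actual S2S parsing code, and the lookup table and values are invented.

# Illustrative only: a toy "dynamic enum" lookup, not Splunk's S2S implementation.
STATUS_TEXT = ["OK", "WARN", "ERROR"]  # human-readable mapping table

def resolve_status_unsafe(code: int) -> str:
    # No bounds check: a crafted code indexes past the intended table.
    return STATUS_TEXT[code]

def resolve_status_safe(code: int) -> str:
    # Validated version: out-of-range codes are rejected instead of dereferenced.
    if 0 <= code < len(STATUS_TEXT):
        return STATUS_TEXT[code]
    raise ValueError(f"status code {code} out of range")

In a memory-safe language the unchecked lookup simply raises an error (or, with a negative index, silently returns the wrong entry); in native code parsing S2S frames, the equivalent unchecked index can read adjacent memory instead, which is what makes the out-of-bounds read exploitable.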

An alternative attack might be to intentionally trigger a page fault, which would shut down the Splunk service. Doing this repeatedly would result in a Denial of Service (DoS) attack. For these reasons, this CVE is considered high severity, with a CVSS score of 7.5.

What mitigations can you put in place?

Splunk has released patches for the impacted components. The versions that are not vulnerable to this attack are 7.3.9, 8.0.9, 8.1.3, and 8.2.0. Your first priority should be upgrading to one of these versions, where this attack has been fully mitigated.

If upgrading is not an option, then you may wish to look into enabling SSL for your Splunk forwarders or enabling forwarder access control using a token. These steps make it more difficult for a malicious attacker to send specially crafted packets to your indexer, because they’ll need to compromise the SSL certificate or the token first.

What do we need to think about in the long term?

OOB vulnerabilities can be particularly nasty, and not just because of the possibility of leaking information from Splunk. If your attacker has specialist knowledge of your system, they can expand to memory that is being used by completely different applications. For example, if an attacker knows where your SSL certificates are loaded in memory, they can target that region directly. They might even be able to extract a full memory dump, one packet at a time.

This means the danger of this Splunk vulnerability isn’t just in the Splunk data you may leak, but in the much more sensitive information the attacker may be able to access. You immediately become dependent on the security of the underlying infrastructure.

How secure is your on-premise infrastructure?

It is tempting to think that your on-premise data centers are a bastion of security. You have complete control over their configuration, so you’re able to finely tune them. In reality, this may not be the case. It’s very easy to forget to enable Address Space Layout Randomization (ASLR) or Data Execution Prevention (DEP) on your instances, both of which would make these types of vulnerabilities more difficult to exploit. These are just two of a number of switches you need to understand to build and deploy secure hardware in your data center.

A cloud provider like AWS will automatically enable these types of features for you, so that your virtual machine is immediately more secure. If this type of attack occurred in a cloud-based environment, it would be much more difficult to exploit adjacent applications in memory, because cloud environments often come with a lot of very sensible security defaults to prevent processes from reading beyond their allotted memory. This is part of the reason why 61% of security researchers say that a breach in a cloud environment is usually equally or less dangerous than the same breach in an on-premise environment. 

Would a SaaS observability tool be impacted by this?

Splunk indexers operate within the tenant’s infrastructure, which means that a vulnerability in the Splunk component is an inherent vulnerability in the user’s software. This reduces the control the user has, because they aren’t the ones producing patches.

Coralogix is a central, multi-tenant, full-stack observability platform that provides a layer of abstraction between the internal workings of your system and your observability data, preventing vulnerabilities like CVE-2021-3422 from being chained with other attacks.

Guide: Smarter AWS Traffic Mirroring for Stronger Cloud Security

So, you’ve installed Coralogix’s STA and you’d like to start analyzing your traffic and getting valuable insights, but you’re not sure whether you’re mirroring enough traffic, or you’re wondering whether you’re mirroring too much data and could be getting more for less.

In order to detect everything, you have to capture everything, and in order to investigate security issues thoroughly, you need to capture every network packet.

More often than not, the data once labeled irrelevant and thrown away is found to be the missing piece in the puzzle when slicing and dicing the logs in an attempt to find a malicious attacker or the source of an information leak.

However, as ideal as this might be, in reality, capturing every packet from every workstation and every server in every branch office is usually impractical and too expensive, especially for larger organizations. Just like in any other field of security, there is no real right or wrong here; it’s more a matter of whether or not the trade-off is worth it in particular cases.

There are several strategies that can be taken to minimize the overall cost of the AWS traffic monitoring solution and still get acceptable results. Here are some of the most commonly used strategies:

1. Mirror by Resource/Information Importance

  1. Guidelines: After mapping out the most critical assets for the organization from a business perspective, configure mirroring so that only traffic to and from the most critical servers and services is mirrored and analyzed. For example, a bank will probably include all SWIFT-related servers, while a software company will probably include all traffic to and from its code repository, release location, etc. (a minimal example of creating such a mirror session with the AWS API appears after this list).
  2. Rationale: The rationale behind this strategy is that mirroring the most critical infrastructures will still provide the ability to detect and investigate security issues that can harm the organization the most and will save money by not mirroring the entire infrastructure.
  3. Pros: By following this strategy, you will improve the visibility around the organization’s critical assets and should be able to detect issues related to your organization’s “crown jewels” (if alerts are properly set) and to investigate such issues.
  4. Cons: Since this strategy won’t mirror traffic from non-crown-jewel environments, you will probably fail to pinpoint the exact (or even approximate) path the attacker took in order to attack the organization’s “crown jewels”.
  5. Tips: If your organization uses a jump-box to connect to the crown-jewel servers and environments, either configure the logs of that jump-box server to be as verbose as possible and store them in Coralogix with a long retention, or mirror the traffic to the jump-box server.

2. Mirror by Resource/Information Risk

  1. Guidelines: After mapping out all the paths and services through which the most critical data of the organization is being transferred or manipulated, configure the mirroring to mirror only traffic to and from those services and routes. The main difference between this strategy and the one mentioned above is that it is focused on sensitive data rather than critical services as defined by the organization.
  2. Rationale: The rationale behind this strategy is that mirroring all the servers and services that may handle critical information will still provide the ability to detect and investigate security issues that can harm the organization the most and will save money by not mirroring the entire infrastructure.
  3. Pros: You will improve the visibility around the critical data across services and environments, and you should be able to detect, by configuring the relevant alerts, attempts to modify or otherwise interfere with handling and transferring the organization’s sensitive data.
  4. Cons: Since this strategy won’t mirror traffic from endpoints connecting to the services and paths used for transmission and manipulation of sensitive data, it might be difficult or even impossible to detect the identity of the attacker and the exact or even approximate path taken by the attacker.
  5. Tips: Collecting logs from firewalls and WAFs that control the connections from and to the Internet and sending them to Coralogix can help a great deal in creating valuable alerts. Correlating them with the logs from the STA can also help identify the attacker (to some extent) and his/her chosen MO (modus operandi).

3. Mirror by Junction Points

  1. Guidelines: Mirror the data that passes through the critical “junction points” such as WAFs, NLBs or services that most of the communication to the organization and its services goes through.
  2. Rationale: The idea behind this strategy is that in many organizations there are several “junction points” such as WAFs, NLBs, or services that most of the communication to the organization and its services goes through. Mirroring this traffic can cover large areas of the organization’s infrastructure by mirroring just a handful of ENIs.
  3. Pros: You will save money on mirroring sessions and avoid mirroring some of the data while still keeping a lot of the relevant information.
  4. Cons: Since some of the data (e.g. lateral connections between servers and services in the infrastructure) doesn’t necessarily traverse the mirrored junction points, it won’t be mirrored which will make it harder and sometimes even impossible to get enough information on the attack or even to be able to accurately detect it.
  5. Tips: Currently, AWS cannot mirror an NLB directly, but it is possible and easy to mirror the server(s) that are configured as target(s) for that NLB. Also, you can increase the logs’ verbosity on the non-monitored environments and services and forward them to Coralogix to compensate for the loss in traffic information.

4. Mirror by Most Common Access Paths

  1. Guidelines: Mirror traffic from every server based on the expected and allowed set of network protocols that are most likely to be used to access it.
  2. Rationale: The idea behind this strategy is that servers that expose a certain service are more likely to be attacked via that same service. For example, an HTTP/S server is more likely to be attacked via HTTP/S than via other ports (at least at the beginning of the attack). Therefore, it makes some sense to mirror the traffic from each server based on the expected traffic to it.
  3. Pros: You will be able to save money by mirroring just part of the traffic that arrived or was sent from the organization’s servers. You will be able to detect, by configuring the relevant alerts, some of the indications of an attack on your servers.
  4. Cons: Since you mirror only the expected traffic ports, you won’t see unexpected traffic being sent to or received from the server, which can be of great value in a forensic investigation.
  5. Tips: Depending on your exact infrastructure and the systems and services in use, it might be possible to cover some of the missing information by increasing the services’ log verbosity and forwarding them to Coralogix.

5. Mirror Some of Each

  1. Guidelines: Randomly select a few instances of each role, region or subnet and mirror their traffic to the STA.
  2. Rationale: The idea behind this strategy is that an attacker is unlikely to know which instances are mirrored and which are not. In addition, many of the tools used by hackers are generic and will try to propagate through the network without checking whether an instance is mirrored. If the attacker tries to move laterally in the network (manually or automatically), or to scan for vulnerable servers and services, it is therefore very likely that they will hit at least one of the mirrored instances (depending on the percentage of instances you have selected in each network region), and if alerts are properly configured, this will raise an alert.
  3. Pros: A high likelihood of detecting security issues throughout your infrastructure, especially the more generic types of malware and malicious activities.
  4. Cons: Since this strategy will only increase the chances of detecting an issue, it is still possible that you will “run out of luck” and the attacker will penetrate the machines that were not mirrored. Also, when it comes to investigations it might be very difficult or even impossible to create a complete “story” based on the partial data that will be gathered.
  5. Tips: Since this strategy is based on a random selection of instances, increasing the verbosity of operating system and audit logs, as well as other service logs, and forwarding them to Coralogix for monitoring and analysis can sometimes help in completing the picture in such cases.
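
As mentioned in strategy 1 above, here is a minimal sketch of how a mirror session for a critical server could be wired up with the AWS API (boto3). The ENI IDs and region are placeholders, the filter simply accepts everything, and a real deployment would more likely manage this through infrastructure-as-code; treat it as an illustration rather than a definitive setup.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an example

# Placeholders: the ENI of the STA (mirror target) and of a critical server (mirror source).
STA_ENI = "eni-aaaaaaaaaaaaaaaaa"
CRITICAL_SERVER_ENI = "eni-bbbbbbbbbbbbbbbbb"

# 1. Register the STA's network interface as a traffic mirror target.
target = ec2.create_traffic_mirror_target(
    NetworkInterfaceId=STA_ENI,
    Description="Coralogix STA mirror target",
)["TrafficMirrorTarget"]

# 2. Create a filter that accepts all traffic in both directions.
mirror_filter = ec2.create_traffic_mirror_filter(
    Description="Mirror everything from critical servers",
)["TrafficMirrorFilter"]
for direction in ("ingress", "egress"):
    ec2.create_traffic_mirror_filter_rule(
        TrafficMirrorFilterId=mirror_filter["TrafficMirrorFilterId"],
        TrafficDirection=direction,
        RuleNumber=100,
        RuleAction="accept",
        SourceCidrBlock="0.0.0.0/0",
        DestinationCidrBlock="0.0.0.0/0",
    )

# 3. Mirror the critical server's ENI to the STA.
ec2.create_traffic_mirror_session(
    NetworkInterfaceId=CRITICAL_SERVER_ENI,
    TrafficMirrorTargetId=target["TrafficMirrorTargetId"],
    TrafficMirrorFilterId=mirror_filter["TrafficMirrorFilterId"],
    SessionNumber=1,
)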

In addition to whichever strategy you choose or develop, we would also recommend that you mirror the following. These will probably cost you next to nothing but can be of great value when you need to investigate an issue or detect security issues (manually and automatically):

  1. All DNS traffic – It is usually the smallest slice of traffic in terms of bytes/sec and packets/sec but can compensate for most of the blind spots that result from such trade-offs.
  2. Mirror traffic that should never happen – Suppose you have a publicly accessible HTTP server that is populated with new content only by scp from another server. Since FTP is one of the most common methods for pushing new content to HTTP servers, mirroring FTP traffic to this server and defining an alert on it will reveal attempts to replace the HTTP contents even before they have succeeded. This is just one example; there are many others (ssh or NFS to Windows servers, RDP, SMB, NetBIOS, and LDAP connections to Linux servers), and you can probably come up with more based on your particular environment. The idea is that since an attacker doesn’t have prior knowledge of the organization’s infrastructure, they will first have to scan hosts to see which operating systems are running and which services they host, for example by trying to connect via SMB (a protocol mostly used by Windows computers); if there is a response, the attacker will assume it is Windows. The same logic applies to Linux. A sketch of such filter rules follows this list.
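
Building on the sketch after the strategy list, the example below shows how the two recommendations above could be expressed as AWS traffic mirror filter rules via boto3: a pair of rules accepting all DNS traffic, plus one accepting FTP traffic toward a web server that should never receive it. The filter ID, IP address, and rule numbers are placeholders.

import boto3

ec2 = boto3.client("ec2")

# Placeholder: an existing traffic mirror filter attached to your mirror sessions.
FILTER_ID = "tmf-0123456789abcdef0"

# Recommendation 1 - mirror all DNS traffic: queries leaving the instance (egress,
# destination port 53) and responses coming back (ingress, source port 53).
ec2.create_traffic_mirror_filter_rule(
    TrafficMirrorFilterId=FILTER_ID,
    TrafficDirection="egress",
    RuleNumber=10,
    RuleAction="accept",
    Protocol=17,  # UDP
    DestinationPortRange={"FromPort": 53, "ToPort": 53},
    SourceCidrBlock="0.0.0.0/0",
    DestinationCidrBlock="0.0.0.0/0",
)
ec2.create_traffic_mirror_filter_rule(
    TrafficMirrorFilterId=FILTER_ID,
    TrafficDirection="ingress",
    RuleNumber=11,
    RuleAction="accept",
    Protocol=17,  # UDP
    SourcePortRange={"FromPort": 53, "ToPort": 53},
    SourceCidrBlock="0.0.0.0/0",
    DestinationCidrBlock="0.0.0.0/0",
)

# Recommendation 2 - mirror traffic that should never happen: FTP (TCP/21) arriving
# at a web server that is only ever updated over scp. Any match here deserves an alert.
ec2.create_traffic_mirror_filter_rule(
    TrafficMirrorFilterId=FILTER_ID,
    TrafficDirection="ingress",
    RuleNumber=20,
    RuleAction="accept",
    Protocol=6,  # TCP
    DestinationPortRange={"FromPort": 21, "ToPort": 21},
    SourceCidrBlock="0.0.0.0/0",
    DestinationCidrBlock="10.0.1.10/32",  # placeholder: the HTTP server's private IP
)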

Cloud Access

In cloud infrastructures, instances and even the mirroring configuration are accessible via the Internet, which theoretically allows an attacker to find out whether an instance is mirrored and to act accordingly. Because of this, it is even more important to make sure that access to the cloud management console is properly secured and monitored.

Exciting New Features of Coralogix STA

We at Coralogix believe that cloud security is not a “nice-to-have” feature – something that only large organizations can benefit from or are entitled to have. We believe it’s a basic need that should be solved for organizations of any shape and size. This is why we built the Coralogix Security Traffic Analyzer (STA) tool for packet sniffing and automated analysis. Today we’re announcing several new features to our security product that you’ll find interesting.

1. Automatic AWS VPC Traffic Mirroring Configuration Manager

One of the great things about AWS is that everything can scale up and down as needed to keep costs at a minimum without losing any important data. Now we’ve brought this power to the VPC Traffic Mirroring configuration. You can read all about it here.

2. Spot/On-demand Choice

The new installation process of the STA now allows you to choose whether you’d like to run the STA as a spot instance in a spot fleet (for example, for testing purposes) or as an on-demand instance. The choice is absolutely yours.

3. Configurable Size

Now you can choose the size of the machine that will be used for the STA. Each size maps to a specific instance type.

4. Automated configuration sync to S3

During installation, you can set an S3 bucket for the STA’s configuration. If the bucket is empty, the STA will automatically copy its config files to that bucket; if the bucket already contains the STA config files and they have been modified (either manually by you or by a script), the STA will automatically pull the new configuration and apply it. This configuration includes several files.
To learn more about how to modify these files see here.
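
If you manage the STA configuration through that bucket, a change can be as simple as downloading a file, editing it, and uploading it back so the STA picks it up. A minimal boto3 sketch follows; the bucket name, key, and file name are placeholders, since the actual config file names are covered in the documentation linked above.

import boto3

s3 = boto3.client("s3")

BUCKET = "my-sta-config-bucket"          # placeholder bucket name
KEY = "sta/config/example-config.yaml"   # placeholder: one of the STA config files

# Download the current config, edit it locally, then push it back so the STA
# detects the change and applies the new configuration.
s3.download_file(BUCKET, KEY, "example-config.yaml")
# ... edit example-config.yaml with your changes ...
s3.upload_file("example-config.yaml", BUCKET, KEY)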

5. Automated upload of .pcap files to S3

During installation, the user can set an S3 bucket that will be used by the STA to upload compressed pcap files of all the traffic observed by the STA. The user can then set any lifecycle hook on that bucket for automated cleanup of old pcap files. This bucket will also contain executable files extracted directly from the traffic. These pcap files can be used for many purposes, including forensic investigations, alert tuning, deeper investigation of application and service issues, and more.
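
Such a lifecycle rule can be set in the console or with a few lines of boto3, as in the sketch below; the bucket name, prefix, and 30-day retention period are assumptions you would adjust to your own setup.

import boto3

s3 = boto3.client("s3")

# Expire mirrored pcap files after 30 days (bucket name and prefix are placeholders).
s3.put_bucket_lifecycle_configuration(
    Bucket="my-sta-pcap-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-pcaps",
                "Filter": {"Prefix": "pcaps/"},
                "Status": "Enabled",
                "Expiration": {"Days": 30},
            }
        ]
    },
)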

6. Monitoring

The new STA contains a built-in Prometheus node-exporter that listens on the third network interface on the default port.
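
If you want a quick way to confirm the exporter is reachable from your Prometheus server, something like the following works; the address of the STA’s third interface is a placeholder, and 9100 is assumed here as node-exporter’s usual default port.

import requests

STA_MONITORING_IP = "10.0.3.15"  # placeholder: address of the STA's third interface
PORT = 9100                      # node-exporter's usual default port (assumption)

resp = requests.get(f"http://{STA_MONITORING_IP}:{PORT}/metrics", timeout=5)
resp.raise_for_status()
print(f"node-exporter is up, returned {len(resp.text.splitlines())} metric lines")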

7. Domain letter frequency analysis

Many cyber attacks nowadays use command and control servers and kill switches for their malicious code, and these usually rely on machine-generated domain names. We added a new capability to the STA to automatically calculate a score for each domain, parent domain, virtual host, certificate CN, etc., based on the frequency of letter combinations that are expected to be rare versus those expected to be frequent. This score can be used to detect machine-generated domains in certificates, common names, DNS requests, and several other locations where a domain name can be found.
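
The idea behind such a score can be illustrated with a short sketch: compare how often the letter pairs in a domain appear in "normal" text versus how rare they are. This is a simplified illustration with a toy reference corpus, not the STA’s actual model.

from collections import Counter

# Toy reference corpus of "normal" domain labels; the STA uses its own statistics.
REFERENCE = "google mail cloud login account update service portal news shop"

def bigrams(text: str):
    text = "".join(c for c in text.lower() if c.isalpha())
    return [text[i:i + 2] for i in range(len(text) - 1)]

REFERENCE_FREQ = Counter(bigrams(REFERENCE))
TOTAL = sum(REFERENCE_FREQ.values())

def domain_score(domain: str) -> float:
    """Higher score = more letter pairs that are rare in the reference corpus."""
    pairs = bigrams(domain.split(".")[0])
    if not pairs:
        return 0.0
    rare = sum(1 for p in pairs if REFERENCE_FREQ[p] / TOTAL < 0.01)
    return rare / len(pairs)

print(domain_score("mail-login-update.com"))   # lower score: common letter pairs
print(domain_score("xkqzvjtqpwzraf.com"))      # high score: likely machine-generated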

8. “Baby Domains”

Employees, and even more so servers, accessing domains that are “young” (in the sense that they were registered only very recently) is often a good indication of malicious activity. The new version of the STA automatically pulls a list of domains with their creation dates and adds the creation date to every domain detected in DNS requests, virtual hosts, and many other fields that contain a domain name. In addition, the new version of the STA contains a special dashboard for displaying such “baby domains” that were accessed by monitored servers and clients.
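
Conceptually, the check is straightforward: compare each observed domain’s registration date with today and flag anything younger than some threshold. The sketch below assumes you already have creation dates from a registration feed; the domains, dates, and 30-day threshold are illustrative only.

from datetime import date, timedelta

# Placeholder data: in practice the creation dates come from a WHOIS / registration feed.
DOMAIN_CREATED = {
    "example.com": date(1995, 8, 14),
    "totally-legit-update.xyz": date.today() - timedelta(days=3),
}

BABY_THRESHOLD = timedelta(days=30)  # illustrative threshold

def is_baby_domain(domain: str) -> bool:
    created = DOMAIN_CREATED.get(domain)
    return created is not None and date.today() - created <= BABY_THRESHOLD

for d in DOMAIN_CREATED:
    print(d, "-> baby domain" if is_baby_domain(d) else "-> established")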

9. NIST Enrichment

The STA will automatically attempt to detect the software and versions running on the client and server machines that took part in the communications it observes. Based on that information, the STA will attempt to find the CVE (Common Vulnerabilities and Exposures) identifiers associated with that software as catalogued by MITRE, and will alert you if a new type of software is found or if vulnerable software is detected.
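
As an illustration of the kind of enrichment involved (not the STA’s internal mechanism), the public NVD API can be queried for CVEs matching a detected product and version via its CPE name. The CPE below is just an example; real usage should also handle paging, rate limits, and ideally an API key.

import requests

# Example CPE for a hypothetically detected product/version; replace with what was observed.
CPE = "cpe:2.3:a:openssl:openssl:1.1.1k:*:*:*:*:*:*:*"

resp = requests.get(
    "https://services.nvd.nist.gov/rest/json/cves/2.0",
    params={"cpeName": CPE},
    timeout=30,
)
resp.raise_for_status()
for vuln in resp.json().get("vulnerabilities", []):
    print(vuln["cve"]["id"])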

10. Default Alerts

We added a default set of more than 60 alerts that will be added to your account after the installation of the STA. These alerts will help you to get started with the STA and dramatically improve your organization’s security posture. You can read more about these alerts here.

11. Default Dashboards

We added a default set of more than 60 different dashboards to help you slice and dice the data to find your needle in the huge haystack.

That’s it for now. We have lots of new exciting features just waiting to be released in the next versions so stay tuned.

How Biden’s Executive Order on Improving Cybersecurity Will Impact Your Systems

President Joe Biden recently signed an executive order which made adhering to cybersecurity standards a legal requirement for federal departments and agencies.

The move was not a surprise. It comes after a string of high-profile cyber-attacks and data breaches in 2020 and 2021. The frequency and scale of these events exposed a clear culture of lax cybersecurity practices throughout both the public and private sectors.

President Biden’s order brings into law many principles which have long been espoused by cybersecurity advocacy groups, such as the National Institute of Standards and Technology (NIST)’s Five Functions. It is the latest legislation in a trend towards greater transparency and regulation of technology in the US.

The Executive Order on Improving the Nation’s Cybersecurity puts in place safeguards that have until now been lacking or non-existent. While regulations are only legally binding for public organizations (and their suppliers), many see it as a foreshadowing of further regulation and scrutiny of cybersecurity in the private sector.

Although the private sector is not directly impacted, the White House sent a memo to corporate leaders urging them to act as though the regulations are legally binding. It’s clear that businesses must take notice of Biden’s drive to safeguard US national infrastructure against cyber threats.

What’s in the Executive Order on Improving the Nation’s Cybersecurity

The order spans numerous sections and covers a range of issues, but several stand out as likely to become relevant to the private sector.

Chief of these is a requirement for IT and OT providers who supply government and public bodies to store and curate data in accordance with new regulations. They must also report any potential incidents and cooperate with any government operation to combat a cyber threat.

The order also implies future changes for secure software development, with the private sector encouraged to develop standards and display labels confirming their products’ security and adherence to regulatory standards. Some also theorize that the government-only mandates for two-factor authentication, encryption, and cloud security could soon extend to private organizations.

The key takeaway for businesses is that, whether it’s next year or a decade from now, it’s likely they’ll be required by law to maintain secure systems. If your security, logging, or systems observability are lacking, Biden’s executive order could be your last warning to get them up-to-scratch before regulations become legally binding.

How does this affect my systems?

Many enterprises are acting as though the executive order is legally binding. This is in no small part due to the White House’s memo urging businesses to do so. A common view is that it won’t be long before regulations outlined in the EO are expanded beyond government.

For suppliers to the government, any laws passed following Biden’s order immediately apply. This even extends to IT/OT providers whose own customers include government bodies. In short, if any part of your system(s) handles government data, you’ll be legally required to secure them according to the regulatory standards.

Data logging and storage regulations

Logging and storage is a key EO focal point. Compliant businesses will have system logs properly collected, maintained, and ready for access should they be required as part of an intelligence or security investigation.

This move is to enhance federal abilities to investigate and remediate threats, and covers both internal network logs and logging data from 3rd party connections. Logs will have to, by law, be available immediately on request. Fortunately, many end-to-end logging platforms make compliance both intuitive and cost-effective.

System visibility requirements

Under the EO, businesses will be required to share system logs and monitoring data when requested. While there aren’t currently legal mandates outlining which data this includes, a thorough and holistic view of your systems will be required during any investigation.

With the order itself stating that “recommendations on requirements for logging events and retaining other relevant data” are soon to come, and shall include “the types of logs to be maintained, the time periods to retain the logs and other relevant data, the time periods for agencies to enable recommended logging and security requirements, and how to protect logs”, it’s clear that future cybersecurity legislation won’t be vague. Compliance requirements, wherever they’re applied, will be specific.

In the near future, businesses found to have critical system visibility blind spots could face significant legal ramifications. Especially if said blind spots become an exploited vulnerability in a national cybercrime or cybersecurity incident.

The legal onus will soon be on businesses to ensure their systems don’t contain invisible back doors into the wider national infrastructure. Your observability platform must provide full system visibility.

Secure services

The EO also included suggestions for software and service providers to create a framework for advertising security compliance as a marketable selling point.

While this mainly serves to create a competitive drive to develop secure software, it’s also to encourage businesses to be scrupulous about 3rd parties and software platforms they engage.

In the not-too-distant future, businesses utilizing non-compliant or insecure software or services will likely face legal consequences. Again, the ramifications will be greater should these insecure components be found to have enabled a successful cyberattack. Moving forward, businesses need to subject the 3rd party services and software they deploy to unprecedented levels of scrutiny.

Security should always be the primary concern. While this should have been the case anyway, the legal framework set out by Biden’s executive order means that investing in only the most secure 3rd party tools and platforms could soon be a compliance requirement.

Why now?

The executive order didn’t come out of the blue. In the last couple of years, there have been several high-profile, incredibly damaging cyberattacks on government IT suppliers and critical national infrastructure.

Colonial Pipeline Ransomware Attack

The executive order was undoubtedly prompted by the Colonial Pipeline ransomware attack. On May 7th, 2021, ransomware created by the hacker group DarkSide compromised critical systems operated by the Colonial Pipeline Company. The attack led to Colonial Pipeline paying $4.4 million in ransom, and the subsequent pipeline shutdown and period of slowed operation caused an emergency fuel shortage declaration in 17 states.

SolarWinds Supply Chain Attack

The Colonial Pipeline ransomware attack was just the latest high-impact cybercrime event with national consequences. In December 2020, SolarWinds, an IT supplier with government customers across multiple executive branches and military/intelligence services, compromised its own system security with an exploitable update.

This ‘supply chain attack’ deployed trojans into SolarWinds customers’ systems through the update. The subsequent vulnerabilities opened a backdoor entrance into many highly classified government databases, including Treasury email traffic.

Why is it necessary?

While the damage of the Colonial Pipeline incident can be measured in dollars, the extent of the SolarWinds compromise has not yet been quantified. Some analysts believe the responsible groups could have been spying on classified communications for months. SolarWinds also had significant private sector customers, including Fortune 500 companies and universities, many of which could have been breached and still be unaware.

Again, these incidents are the latest in several decades marked by increasingly severe cyberattacks. Unless action is taken, instances of cybercrime that threaten national security will become not only more commonplace but more damaging.

Cybersecurity: An unprecedented national concern

Cybercrime is a unique threat. A single actor could potentially cause trillions of dollars in damages (assuming their goal is financial and not something more sinister). What’s more, the list of possible motivations for cybercriminals is far wider.

Whereas a state or non-state actor threatening US interests with a physical attack is usually politically or financially motivated (thus easier to predict), there have been many instances of ‘troll hackers’ targeting organizations for no reason other than to cause chaos.

When you factor this in with the constantly evolving global technical ecosystem, lack of regulation looks increasingly reckless. The threat of domestic terrorism is seen as real enough to warrant tight regulation of air travel (for example). Biden’s executive order is a necessary step towards cybercrime being treated as the equally valid threat it is.

Cybersecurity: A necessary investment long before Biden’s EO

Biden’s EO has shaken up how both the government and private sector are approaching cybersecurity. However, as the executive order itself and the events that preceded it prove, it’s a conversation that should have been happening much sooner.

The key takeaway for businesses from the executive order should be that none of the stipulations and requirements are new. There is no guidance in the EO which cybersecurity advocacy groups haven’t been espousing for decades.

Security, visibility, logging, and data storage/maintenance should already be core focuses for your business’s IT teams. The security of your systems and IT infrastructure should be paramount, ahead of any attempt to optimize them as a productivity and revenue boost.

Fortunately, compliance with any regulations the EO leads to doesn’t have to be a challenge. 3rd party platforms such as Coralogix offer a complete, end-to-end observability and logging solution which keeps your systems both visible and secure.

What’s more, the optimized costs and enhanced functionality over other platforms mean compliance with Biden’s EO needn’t be a return-free investment.

 

How Cloudflare Logs Provide Traffic, Performance, and Security Insights with Coralogix

Cloudflare secures and ensures the reliability of your external-facing resources such as websites, APIs, and applications. It protects your internal resources such as behind-the-firewall applications, teams, and devices. This post will show you how Coralogix can provide analytics and insights for your Cloudflare log data – including traffic, performance, and security insights.

To get all Cloudflare dashboards and alerts, follow the steps described below or contact our support on our website/in-app chat. We reply in under 2 minutes!

Cloudflare Logs

Cloudflare provides detailed logs of your HTTP requests. Use these logs to debug or to identify configuration adjustments that can improve performance and security. You can leverage your rich Cloudflare log data through Coralogix’s User-defined Alerts and Data Dashboards to instantly discover trends and patterns within any given metric of your application-clients ecosystem, spot potential security threats, and get real-time notifications on any event you might want to observe. Ultimately, you get a better Cloudflare monitoring experience and more capability from your data, with minimal effort.

To start shipping your Cloudflare logs to Coralogix, follow this simple tutorial.

Cloudflare Dashboards

Once you’ve started shipping your Cloudflare logs to Coralogix, you can immediately extract insights and set up dashboards to visualize your data.

All Cloudflare logs are JSON logs. Based on one or more fields, you may define your visualizations and gather them into dashboards. The options are practically limitless, and you may create any visualization you can think of, as long as your logs contain the data you want to visualize. For more information, visit our Kibana tutorial.

There are nine out-of-the-box dashboards that are ready to use. You may import them with the following steps:

  1. Download the cloudflare_export.ndjson.zip file and save it locally.
  2. Unzip the file.
  3. Open cloudflare_export.ndjson with a text editor and replace all occurrences of *:index_pattern_newlogs* with your default pattern, for example: *:1111_newlogs*. By default, the index_pattern will be your company ID. (A small script that automates this replacement is shown after these steps.)
  4. Save the file.
  5. Login to Coralogix and click the Kibana button.
  6. Choose Management -> Saved Objects
  7. Click the Import button.
  8. Click the Import text in the section below Please select a file to import.
  9. Choose the cloudflare_export.ndjson file.
  10. Click the Import button at the bottom.
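
As noted in step 3, the replacement can also be done with a couple of lines of Python instead of a text editor. The index pattern below is the same example value used in step 3; substitute your own company ID.

# Replace the placeholder index pattern in the exported dashboards file.
# "1111" is the example company ID from step 3 - use your own.
with open("cloudflare_export.ndjson", encoding="utf-8") as f:
    content = f.read()

content = content.replace("index_pattern_newlogs", "1111_newlogs")

with open("cloudflare_export.ndjson", "w", encoding="utf-8") as f:
    f.write(content)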

Notes:

  1. Some visualizations may not be available if you didn’t specify the corresponding fields during Cloudflare Push Log Service configuration.
  2. If you want to use visualizations where the country name field is used, you need to enable Geo Enrichment on the ClientIP key. Click here for more information, or ping us on the chat and we’ll enable it for you.
  3. If your application name is anything other than Cloudflare, you need to adjust the saved search filter “Saved Search – Cloudflare”.

Cloudflare – Snapshot

This is the main dashboard where you can take a look at your traffic. There are statistics about the total number of requests, bandwidth, cached bandwidth, threats, HTTP protocols, traffic types, and much more general information.

Cloudflare – Performance (Requests, Bandwidth, Cache), Cloudflare – Performance (Hostname, Content Type, Request Methods, Connection Type), Cloudflare – Performance (Static vs. Dynamic Content)

Monitor the performance – get details on the traffic. Identify and address performance issues and caching misconfigurations. Get your most popular hostnames, most requested content types, request methods, connection type, and your static and dynamic content, including the slowest URLs.


Cloudflare – Security (Overview), Cloudflare – Security (WAF), Cloudflare – Security (Rate Limiting), Cloudflare – Security (Bot Management)

Security dashboards let you track threats to your website/applications over time and per type/country. Web Application Firewall events will help you tune the firewall and prevent false positives. Rate Limiting protects against denial-of-service attacks, brute-force login attempts, and other types of abusive behavior targeting the application layer.


Cloudflare – Reliability

Get insights into the availability of your websites and applications. Metrics include origin response error ratio, origin response status over time, percentage of 3xx/4xx/5xx errors over time, and more.


Alerts

The user-defined alerts in Coralogix enable you to obtain real-time insights based on the criteria of your own choosing. Well-defined alerts will allow you and your team to be notified about changes in your website/applications. Here are some examples of alerts we created using Cloudflare HTTP Requests data.

1. No logs from Cloudflare

When Cloudflare stops sending logs for some reason, it is important for us to be notified.

Alert Filter: Set a filter on the application name that represents your Cloudflare logs. In our case, we named it cloudflare.

Alert Condition: less than 1 time in 5 minutes


2. Bad Bots

Be notified about a high volume of bot requests.

Alert Filter:
– Search Query: EdgePathingSrc.keyword:"filterBasedFirewall" AND EdgePathingStatus.keyword:"captchaNew"
– Applications: cloudflare

Alert Condition: more than 3 times in 5 minutes

3. Threats Stopped

Be notified about the threats which were stopped.

Alert Filter:
– Search Query: (EdgePathingSrc.keyword:"bic" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ctry") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:/ip*/) OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaErr") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlErr") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew")
– Applications: cloudflare

Alert Condition: more than 5 times in 10 minutes


4. Threats vs Non-Threats ratio

Be notified if threat requests exceed 10% of non-threat requests.

Alert type: Ratio

Alert Filter:
– Search Query 1: (EdgePathingSrc.keyword:"bic" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ctry") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:/ip*/) OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaErr") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlErr") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew")

– Search Query 2: NOT ((EdgePathingSrc.keyword:"bic" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"unknown") OR (EdgePathingSrc.keyword:"hot" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ip") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:"ctry") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"ban" AND EdgePathingStatus.keyword:/ip*/) OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaErr") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlFail") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlNew") OR (EdgePathingSrc.keyword:"macro" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"jschlErr") OR (EdgePathingSrc.keyword:"user" AND EdgePathingOp.keyword:"chl" AND EdgePathingStatus.keyword:"captchaNew"))

– Applications: cloudflare

Alert Condition: Alert if Query 1 / Query 2 equals more than 0.1 in 10 minutes


5. EdgeResponseStatus: more 4xx or 5xx than usual

The EdgeResponseStatus field provides the HTTP status code returned by Cloudflare to the client.

Alert Filter:
– Search Query: EdgeResponseStatus.numeric:[400 TO 599]
– Applications: cloudflare

Alert Condition: more than usual with threshold 10 times


6. OriginResponseStatus: more 4xx and 5xx than usual

The OriginResponseStatus field is the HTTP status code returned by the origin server.

Alert Filter:
– Search Query: OriginResponseStatus.numeric:[400 TO 599]
– Applications: cloudflare

Alert Condition: more than usual with threshold 10 times

7. Longer DNS response time

The time taken to receive a DNS response for an origin name. It is usually 0, but may be longer if a CNAME record is used.

Alert Filter:
– Search Query: NOT OriginDNSResponseTimeMs.numeric:[0 TO 10]
– Applications: cloudflare

Alert Condition: more than 10 times in 10 minutes

What’s the Most Powerful Tool in Your Security Arsenal? 

Trying to work out the best security tool is a little like trying to choose a golf club three shots ahead – you don’t know what will help you get to the green until you’re in the rough.

Traditionally, when people think about security tools, firewalls, IAM and permissions, encryption, and certificates come to mind. These tools all have one thing in common – they’re static. In this piece, we’re going to examine the security tools landscape and understand which tool you should be investing in.

Security Tools – The Lay of the Land

The options available today to the discerning security-focused organization are diverse. From start-ups to established enterprises, understanding who makes the best firewalls or who has the best OWASP top ten scanning is a nightmare. We’re not here to compare vendors, but rather to evaluate the importance of the major tools in your repertoire.

Firewalls and Intrusion Detection Systems

Firewalls are a must-have for any organization; no one is arguing there. Large or small, containerized or monolithic, without a firewall you’re in big trouble.

Once you’ve selected and configured your firewall, ensuring it gives the protection you need, you might think you’ve uncovered a silver bullet. The reality is that you have to stay on top of some key parameters to make sure you’re maximizing the protection of the firewall or IDS. Monitoring outputs such as traffic, bandwidth, and sessions are all critical to understanding the health and effectiveness of your firewalls.

Permissions Tooling

The concept of Identity and Access Management has evolved significantly in the last decade or so, particularly with the rise of the cloud. The correct provisioning of roles, users, and groups for the purposes of access management is paramount for keeping your environment secure.

Staying on top of the provisioning of these accesses is where things can get a bit difficult. Keeping track of all of the permissions assigned to individuals, applications, and functions alike (through a service such as AWS CloudWatch) is hard. While public CSPs have made this simpler, the ability to view permissions in the context of what’s going on in your system gives enhanced security and confidence.

Encryption Tooling

Now more than ever, encryption is at the forefront of any security-conscious individual’s mind. Imperative for protecting both data at rest and in flight, encryption is a key security tool. 

Once implemented, you need to keep track of your encryption, ensuring that it remains in place for whatever you’re trying to protect. Be it disk encryption or encrypted traffic on your network, it needs to be subject to thorough monitoring.

Monitoring is the Foundation

With all of the tool types that we’ve covered, there is a clear and consistent theme. Not only do all of the above tools have to be provisioned, they also rely on strong and dependable monitoring to assist proactive security and automation.

Security Incident Event Management

The ability to have a holistic view of all of your applications and systems is key. Not only is it imperative to see the health of your network, but if part of your application stack is underperforming it can be either symptomatic of, or inviting to, malicious activity.

SIEM dashboards are a vital security tool that uses the concept of data fusion to provide advanced modeling and give context to otherwise isolated metrics. Using advanced monitoring and alerting, the best SIEM products will not only dashboard your system health and events in real time, but also retain log data for a period of time to enable event timeline reconstruction.

The Power of Observability

Observability is the new thing in monitoring. It expands beyond merely providing awareness of system health and security status to giving cross-organizational insights which drive real business outcomes.

What does this mean for our security tooling? Well, observability practices drive relevant insights to the individuals most empowered to act on them. In the instance of system downtime, this would be your SREs. In the case of an application vulnerability, this would be your DevSecOps ninjas.

An observability solution working in real time will not only provide telemetry on the health and effectiveness of your security tool arsenal, but will also give real-time threat detection. 

Coralogix Cloud Security

Even if you aren’t certain which firewall or encryption type comes out on top, you can be certain of Coralogix’s cloud security solution.

With a quick, 3-step setup and out-of-the-box functionality including real-time monitoring, you can be sure that your tools and engineers can react in a timely manner to any emerging threats. 

Easily connect any data source to complete your security observability, including audit logs, CloudTrail, GuardDuty, or any other source. Monitor your security data in one of 100+ pre-built dashboards or easily build your own using our variety of visualization tools and APIs.

What to Consider When Monitoring Hybrid Cloud Architecture

Hybrid cloud architectures provide the flexibility to utilize both public cloud and on-premise environments in the same infrastructure. This enables scalability and power that is easy and cost-effective to leverage. However, an ecosystem containing components with dependencies layered across multiple clouds has its own unique challenges.

Adopting a hybrid log monitoring strategy doesn’t mean you need to start from scratch, but it does require a shift in focus and some additional considerations. You don’t need to reinvent the wheel as much as realign it.

In this article, we’ll take a look at what to consider when building a monitoring stack or solution for a hybrid cloud environment.

Hybrid Problems Need Hybrid Solutions

Modern architectures are complex and fluid with rapid deployments and continuous integration of new components. This makes system management an arduous task, especially if your engineers and admins can’t rely on an efficient monitoring stack. Moving to a hybrid cloud architecture without overhauling your monitoring tools will only complicate this further, making the process disjointed and stressful.

Fortunately, there are many tools available for creating a monitoring stack that provides visibility in a hybrid cloud environment. With the right solutions implemented, you can unlock the astounding potential of infrastructures based in multiple cloud environments.

Mirroring Cloud Traffic On-Premise

When implementing your hybrid monitoring stack, covering blind spots is a top priority. This is true for any visibility-focused engineering, but blind spots are especially problematic in distributed systems. It’s difficult to trace and isolate root causes of performance issues with data flowing across multiple environments. Doubly so if some of those environments are dark to your central monitoring stack.

One way to overcome this is to mirror all traffic to and between external clouds back to your on-premise environment. Using a vTAP (short for virtual tap), you can capture and copy data flowing between cloud components and feed the ‘mirrored’ data into your on-premise monitoring stack.

Traffic mirroring with implemented vTAP software solutions ensures that all system and network traffic is visible, regardless of origin or destination. The ‘big 3’ public cloud providers (AWS, Azure, Google Cloud) offer features that enable mirroring at a packet level, and there are many 3rd party and open source vTAP solutions readily available on the market.

Packet-Level Monitoring for Visibility at the Point of Data Transmission

As mentioned, the features and tools offered by the top cloud providers allow traffic mirroring down to the packet level. This is very deliberate on their part. Monitoring traffic and data at a packet level is vital for any effective visibility solution in a hybrid environment.

In a hybrid environment, data travels back and forth between public and on-premise regions of your architecture regularly. This can make tracing, logging, and (most importantly) finding the origin points of errors a challenge. Monitoring your architecture at a packet level makes tracing the journey of your data a lot easier.

For example, monitoring at the packet level picks up on failed cyclic redundancy checks and checksums on data traveling between public and on-premise components. Compromised data is filtered upon arrival. What’s more, automated alerts when packet loss spikes allow your engineers to isolate and fix the offending component before the problem potentially spirals into a system-wide outage.
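
As a simplified illustration of the kind of automated check described above, the sketch below compares packet counters from two points in the path and flags a loss ratio above a threshold. The counter values and the 1% threshold are invented for the example.

# Simplified illustration: flag a packet-loss spike between two measurement points.
sent_at_source = 1_000_000          # packets counted leaving the on-premise component
received_at_destination = 987_500   # packets counted arriving at the cloud component

LOSS_THRESHOLD = 0.01  # alert above 1% loss (illustrative)

loss_ratio = (sent_at_source - received_at_destination) / sent_at_source
if loss_ratio > LOSS_THRESHOLD:
    print(f"ALERT: packet loss {loss_ratio:.2%} exceeds {LOSS_THRESHOLD:.0%}")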

Implementing data visibility at the point of transmission and verifying data integrity and authenticity in real time quickly identifies faulty components or vulnerabilities. There are much higher levels of data transmission in hybrid environments. As such, any effective monitoring solution must ensure that data isn’t invisible while in transit.

Overcoming the Topology Gulf

Monitoring data in motion is key, and full visibility of where that data is traveling from and to is just as important. A topology of your hybrid architecture is far more critical than it is in a wholly on-premise infrastructure (where it is already indispensable). Without an established map of components in the ecosystem, your monitoring stack will struggle to add value.

Creating and maintaining an up-to-date topology of a hybrid-architecture is a unique challenge. Many legacy monitoring tools lack scope beyond on-premise infrastructure, and most cloud-native tools offer visibility only within their hosted service. Full end-to-end discovery must overcome the gap between on-premise and public monitoring capabilities.

On the surface, it requires a lot of code change and manual reconfigurations to integrate the two. Fortunately, there are ways to mitigate this, and they can be implemented from the early conceptual stages of your hybrid-cloud transformation.

Hybrid Monitoring by Design

Implementing a hybrid monitoring solution post-design phase is an arduous process. It’s difficult to achieve end-to-end visibility once the components of your architecture are already deployed and in use.

One of the advantages of having components in the public cloud is the flexibility afforded by access to an ever-growing library of components and services. However, utilizing this flexibility means your infrastructure is almost constantly changing, making both discovery and mapping troublesome. Tackling this in the design stage ensures that leveraging the flexibility of your hybrid architecture doesn’t disrupt the efficacy of your monitoring stack.

By addressing real-time topology and discovery in the design stage, your hybrid architecture and all associated operational tooling will be built to scale in a complementary manner. Designing your hybrid architecture with automated end-to-end component/environment discovery as part of a centralized monitoring solution, for example, keeps all components in your infrastructure visible regardless of how large and complex your hybrid environment grows.

Avoiding Strategy Silos

Addressing monitoring at the design stage ensures that your stack can scale with your infrastructure with minimal manual reconfiguration. It also helps avoid another common obstacle when monitoring hybrid-cloud environments, that of siloed monitoring strategies.

Having a clearly established, centralized monitoring strategy keeps you from approaching monitoring on an environment-by-environment basis. Why should you avoid an environment-by-environment approach? Because it quickly leads to siloed monitoring stacks, and separate strategies for your on-premise and publicly hosted systems.

While the processes differ and tools vary, the underpinning methodology behind how you approach monitoring both your on-premise and public components should be the same. You and your team should have a clearly defined monitoring strategy that everything you implement adheres to and contributes towards. Using different strategies for different environments quickly leads to fragmented processes, poor component integration, and ineffective architecture-wide monitoring.

Native Tools in a Hybrid Architecture

AWS, Azure, and Google all offer native monitoring solutions — AWS CloudWatch, Azure Monitor, and Google Stackdriver. Each of these tools enables access to operational data, observability, and monitoring in its respective environment. Full end-to-end visibility would be impossible without them. While they are a necessary part of any hybrid monitoring stack, they can also lead to vendor reliance and the siloed strategies we are trying to avoid.

In a hybrid cloud environment, these tools should be part of your centralized monitoring stack, but they should not define it. Native tools are great at metrics collection in their hosted environments. What they lack in a hybrid context, however, is the ability to provide insight across the entire infrastructure.

Relying solely on native tools won’t provide comprehensive, end-to-end visibility. They can only provide an insight into the public portions of your hybrid-architecture. What you should aim for is interoperability with these components. Effective hybrid monitoring taps into these components to use them as valuable data sources in conjunction with the centralized stack.

Defining ‘Normal’

What ‘normal’ looks like in your hybrid environment will be unique to your architecture, and it can only be established by analyzing the infrastructure as a whole. Visibility into your public cloud components is vital, but only an architecture-wide view lets you define what shape ‘normal’ takes.

Without understanding and defining ‘normal’ operational parameters it is incredibly difficult to detect anomalies or trace problems to their root cause. Creating a centralized monitoring stack that sits across both your on-premise and public cloud environments enables you to embed this definition into your systems.

Once your system is aware of what ‘normal’ looks like as operational data, processes can be put in place to strengthen the effectiveness of your stack across the architecture. This can be achieved in many ways, from automating anomaly detection to setting up automated alerts.

You Can’t Monitor What You Can’t See

These are just a few of the considerations you should take when it comes to monitoring a hybrid-cloud architecture. The exact challenges you face will be unique to your architecture.

If there’s one principle to remember at all times, it’s this: you can’t monitor what you can’t see.

When things start to become overcomplicated, return to this principle. No matter how complex your system is, this will always be true. Visibility is the goal when creating a monitoring stack for hybrid cloud architecture, the same as it is with any other.

DevSecOps vs DevOps: What are the Differences?

The modern technology landscape is ever-changing, with an increasing focus on methodologies and practices. Recently, we’ve seen a clash between two of the newest and most popular players: DevOps vs DevSecOps. With new methodologies come new mindsets, new approaches, and a change in how organizations run. 

What’s key for you to know, however, is whether they are actually different. If so, how are they different? And, perhaps most importantly, what does this mean for you and your development team?

In this piece, we’ll examine the two methodologies and quantify their impact on your engineers.

DevOps: Head in the Clouds

DevOps, the synergizing of Development and Operations, has been around for a few years. Adoption of DevOps principles has been common across organizations large and small, with the share of teams achieving elite performance through DevOps practices up 20%.

The technology industry is rife with buzzwords, and saying that you ‘do DevOps’ is not enough. It’s key to truly understand the principles of DevOps.

The Principles of DevOps

Development + Operations = DevOps. 

There are widely accepted core principles to ensure a successful DevOps practice. In short, these are: fast and incremental releases, automation (the big one), pipeline building, continuous integration, continuous delivery, continuous monitoring, sharing feedback, version control, and collaboration. 

If we remove the “soft” principles, we’re left with some central themes: speed and continuity, achieved through automation and monitoring. Plenty of DevOps transformation projects have failed because of poor collaboration or feedback sharing, but the technical bar matters just as much: if your team can’t automate everything and monitor effectively, it ain’t DevOps. 

The Pitfalls of DevOps

As above, having the right people with the right hard and soft skills is key for DevOps success. Many organizations have made the mistake of simply rebadging a department, or sending all of their developers on an AWS course and all their infrastructure engineers on a Java course. This doesn’t work – colocation and constant communication (whether in person or via tools like Slack and Trello) are the first enablers in breaking down silos and fostering collaboration. 

Not only will this help your staff cross-pollinate their expertise, saving on your training budget, but it also enables an organic and seamless workflow. No two organizations or tech teams are the same, so no “one size fits all” approach can be successfully applied.

DevSecOps: The New Kid On The Block

Some people will tell you that they have been doing DevSecOps for years, and they might be telling the truth. However, DevSecOps as a formal and recognized doctrine is still in its relative infancy. If DevOps is the merging of Development and Operations, then DevSecOps is the meeting of Development, Security, and Operations. 

Like we saw with DevOps adoption, it’s not just as simple as sending all your DevOps engineers on a security course. DevSecOps is more about the knowledge exchange between DevOps and Security, and how Security can permeate the DevOps process. 

When executed properly, the “Sec” shouldn’t be an additional consideration, because it is part of each and every aspect of the pipeline.

What’s all the fuss with DevSecOps?

The industry is trending towards DevSecOps, as security dominates the agenda of every board meeting of every big business. With the average cost of a data breach at $3.86 million, it’s no wonder that organizations are looking for ways to incorporate security at every level of their technology stack.

You might integrate OWASP vulnerability scanning into your build tools, use Istio for application and container-level security and alerting, or just enforce the use of Infrastructure as Code across the board to stamp out human error.

However, DevSecOps isn’t just about baking Security into the DevOps process. By shifting security left in the process, you can avoid compliance hurdles at the end of the pipeline. This ultimately allows you to ship faster. You also minimize the amount of rapid patching you have to do post-release, because your software is secure by design.

As pointed out earlier, DevOps is already a successful methodology. Is it too much of a leap to enhance this already intimidating concept with security as well? 

DevOps vs DevSecOps: The Gloves Are Off

What is the difference between DevOps and DevSecOps? The simple truth is that in the battle royale of DevOps vs DevSecOps, the latter, newer, more secure contender wins. Not only does it make security more policy-driven, more agile, and more all-encompassing, it also breaks down the organizational silos that are harmful to your overall SDLC.

The key to getting DevSecOps right lies in two simple principles – automate everything and have omnipresent monitoring and alerting. The reason for this is simple – automation works well when it’s well-constructed, but it still relies on a trigger or preceding action to prompt the next function. 

Every single one of TechBeacon’s 6 DevSecOps best practices relies on solid monitoring and alerting – doesn’t that say a lot?

Coralogix: Who You Want In Your Corner

Engineered to support DevSecOps best practices, Coralogix is the ideal partner for helping you put security at the center of everything.

The Alerts API allows you to feed ML-driven DevOps alerts straight into your workflows, enabling you to automate more efficient responses and even detect nefarious activity faster. Easy-to-query log data combined with automated benchmark reports ensures you’re always on top of your system health. Automated Threat Detection turns your web logs into part of your security stack. 

With battle-tested software and a team of experts servicing some of the largest companies in the world, you can rely on Coralogix to keep your guard up.

Best Practices for Writing Secure Java Code

Every Java developer should follow coding standards and best practices to develop secure Java code. It is critical your code is not vulnerable to exploits or malicious attacks. In recent times, even big organizations like eBay, the CIA, and the IRS have fallen victim to vulnerabilities in their applications that have been discovered and exploited by attackers. 

The following guidelines provide a solid foundation for writing secure Java code and applications. These will minimize the possibility of creating security vulnerabilities caused by Java developers and help prevent known malicious attacks. 

1. Only Use Tried and Tested Libraries 

A large percentage of the code in applications is sourced from public libraries and frameworks. These libraries can contain vulnerabilities that may allow a malicious attacker to exploit your application. 

Organizations trust their business and reputation to the libraries they use, so make sure you only use proven ones and keep them up to date with the latest versions. Consider checking if they have any known vulnerabilities or require any security fixes.

2. Avoid Serialization

Java serialization is inherently insecure, which is why Oracle recently announced a long-term plan to remove it. Serialization vulnerabilities were recently found in Cisco and Jenkins applications. 

Any application that accepts serialized Java objects is vulnerable, even when the deserialization happens inside a library or framework rather than in your own Java code. One area to watch out for is making an interface serializable without thinking through what could be exposed. Another pitfall to avoid is accidentally making a security-sensitive class serializable, either by subclassing or by implementing a serializable interface.
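
If you can’t avoid deserialization entirely, one mitigation worth knowing about is a serialization filter (JEP 290, available from Java 9 onward) that whitelists only the classes you expect. Below is a minimal sketch; the com.example.dto package is a hypothetical stand-in for your own data-transfer classes.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;

public class SafeDeserialization {

    // Accept only our own DTO classes plus core JDK classes; reject everything else.
    private static final ObjectInputFilter FILTER =
            ObjectInputFilter.Config.createFilter("com.example.dto.*;java.base/*;!*");

    public static Object readTrusted(byte[] payload) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            in.setObjectInputFilter(FILTER);   // JEP 290 serialization filter (Java 9+)
            return in.readObject();            // classes outside the whitelist are rejected
        }
    }
}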

3. Always Hash User Passwords

Never store any passwords as plain text. Always hash user passwords, preferably using a salted hash and a recommended hashing algorithm such as one from the SHA-2 family. When a password has been ‘hashed’, it has been turned into a scrambled, fixed-length digest of itself. A salt is a random value, unique to each user, that is combined with the password before hashing, so that identical passwords don’t produce identical hashes and precomputed lookup tables are useless against your database.
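
As a concrete illustration, here is a minimal sketch of salted hashing using the JDK’s built-in PBKDF2 implementation (PBKDF2WithHmacSHA256, which is built on the SHA-2 family). The iteration count is an assumption you should tune for your own hardware, and in production you would normally lean on a vetted library or framework rather than hand-rolling this.

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.spec.InvalidKeySpecException;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordHasher {

    private static final int ITERATIONS = 210_000;   // assumption: tune for your hardware
    private static final int KEY_LENGTH_BITS = 256;

    public static String hash(char[] password) throws NoSuchAlgorithmException, InvalidKeySpecException {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);            // random, per-user salt

        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_LENGTH_BITS);
        byte[] hash = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                      .generateSecret(spec)
                                      .getEncoded();
        spec.clearPassword();                          // wipe the password from memory

        // Store the salt alongside the hash (never the plain text password)
        return Base64.getEncoder().encodeToString(salt) + ":" + Base64.getEncoder().encodeToString(hash);
    }
}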

4. Filter Sensitive Information From Exceptions

Exception objects can contain sensitive information that can assist an attacker hoping to exploit your system. An attacker can manufacture input arguments to expose internal structures and mechanisms of the application. It’s important to remember that information can be leaked from the exception message text and the type of an exception.

Take FileNotFoundException as an example. Its message can reveal details about the layout of the file system, while the exception type itself tells an attacker that a requested file is missing. 

To secure Java applications, you should filter both the exception message and the exception type before anything is returned to the user.
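
As a rough sketch of that idea: log the detailed exception internally and surface only a generic message to the caller. The logger and the generic exception type used here are illustrative choices, not a prescribed API.

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ConfigLoader {

    private static final Logger LOG = Logger.getLogger(ConfigLoader.class.getName());

    public FileReader open(String requestedName) {
        try {
            return new FileReader(requestedName);
        } catch (FileNotFoundException e) {
            // Full detail (path, exception type) goes to the internal log only
            LOG.log(Level.WARNING, "Failed to open configuration file", e);
            // The caller sees neither the path nor the original exception type
            throw new IllegalStateException("The requested resource is unavailable");
        }
    }
}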

5. Do Not Log Sensitive Information

Data thefts cause massive harm to individuals and organizations, and developers need to do everything possible to prevent them from happening. Information like credit and debit card numbers, bank account numbers, passport numbers, and passwords are highly sensitive and valuable to criminals. Don’t store this type of information in log files and make sure it’s not detectable through searches in cleartext. 

If you have to log any sensitive information like card numbers, for example, think about logging only part of the card number, e.g. the last four digits, and make sure it’s encrypted using a proven library. Don’t write your own encryption functionality. 
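
A minimal sketch of the masking part of that advice might look like the following; the class and method names are hypothetical, and your logging framework will dictate how the masked value is actually written out.

public final class CardMasker {

    private CardMasker() {
    }

    // Keep only the last four digits; everything else is replaced before it reaches any log line.
    public static String maskCardNumber(String cardNumber) {
        if (cardNumber == null || cardNumber.length() < 4) {
            return "****";
        }
        String lastFour = cardNumber.substring(cardNumber.length() - 4);
        return "**** **** **** " + lastFour;
    }
}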

6. Error Handling and Logging 

It’s easy to accidentally reveal sensitive information, such as account details or system internals, both in error messages shown to users and in error messages recorded in log files. 

A safer way is to use generic screen error messages for users. Additionally, write log error messages that will help support teams investigating production issues without providing an attacker with useful information to further exploit your systems.

7. Write Simple Java Code

Generally speaking, simple Java code is secure Java code. Here are some tips on keeping your code simple and secure:

  • Keep it as simple as possible without reducing functionality. 
  • Use code quality checking products like SonarQube. This tool will continuously inspect code quality whilst checking for any new vulnerabilities in your latest code release. Once a bug or vulnerability reaches production, it is far harder to fix than it would have been to prevent in the first place. 
  • Expose the minimum amount of information in your code. Hiding implementation details is good for keeping your code both secure and maintainable. 
  • Make good use of Java’s access modifiers. Declare the most restrictive access levels possible for classes, methods, and their attributes, and set everything that can be private to private (see the sketch after this list). 
  • Always define the smallest possible API and interface objects. Decouple components and make them interact in the smallest scope possible. If one component of your application is compromised by a breach, the others will be safe.
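
To make the access-modifier and minimal-API points concrete, here is an illustrative sketch (the names and logic are hypothetical): a small public interface and an implementation that keeps its constructor, state, and helpers as restricted as possible.

// Smallest possible surface area: the interface exposes one operation,
// and the implementation keeps its state and helpers private.
public interface TokenValidator {
    boolean isValid(String token);
}

final class HmacTokenValidator implements TokenValidator {

    private final byte[] secret;                 // private state, never exposed

    HmacTokenValidator(byte[] secret) {          // package-private constructor
        this.secret = secret.clone();
    }

    @Override
    public boolean isValid(String token) {
        return verifySignature(token);
    }

    private boolean verifySignature(String token) {
        // Implementation detail hidden from callers (placeholder logic, not a real check)
        return token != null && !token.isEmpty() && secret.length > 0;
    }
}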

8. Prevent Injection Attacks

An injection attack occurs when an attacker supplies malicious input that an application then interprets as code or as part of a query. This type of attack is considered a major problem in web application security and is listed as the number one security risk in the OWASP Top 10. Any application that allows users to enter or upload data might contain a vulnerability that allows an injection attack. Insufficient validation of user input is usually the primary reason injection vulnerabilities exist. 

SQL Injection 

SQL Injection vulnerabilities are created when developers write dynamic database queries that can include user input. An attacker can include SQL commands in the input data, in any screen input field. Then because of a vulnerability in the code, the application runs the rogue SQL in the database. This gives attackers a way to bypass the application’s authentication functionality and allow them to retrieve the contents of an entire database. 

Key things to remember to prevent SQL injections: 

  • Never build SQL statements by concatenating arguments; doing so makes SQL injection attacks highly likely.
  • Avoid dynamic SQL. Use prepared statements with parameterized queries (see the sketch after this list). 
  • Use stored procedures. 
  • Whitelist input validation. 
  • Escape user-supplied input. 
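
To make the prepared statement advice concrete, here is a minimal sketch, assuming you already have a java.sql.Connection and a hypothetical users table:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLookup {

    // The '?' placeholder keeps user input as data; it is never concatenated into the SQL text.
    private static final String FIND_USER_SQL = "SELECT id, email FROM users WHERE username = ?";

    public boolean userExists(Connection connection, String userSuppliedName) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(FIND_USER_SQL)) {
            statement.setString(1, userSuppliedName); // bound as a parameter, not as SQL
            try (ResultSet results = statement.executeQuery()) {
                return results.next();
            }
        }
    }
}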

XPath Injection 

XPath injections are similar to SQL injections in that they target websites that use user-supplied information to construct an XPath query over XML data. By sending malicious input to the website, an attacker can gain detailed information on how the XML data is structured or access data that is not normally accessible. 

These vulnerabilities can also elevate the attacker’s privileges in the application if the XML data is being used for authentication. 

You can avoid XPath injection with techniques similar to those used to prevent SQL injection (see the sketch after this list): 

  • Sanitize all user input. 
  • When sanitizing, verify the data type, format, length, and content. 
  • In client-server applications, perform validation on both the client and the server side.
  • Thoroughly test applications, especially how they handle user input. 
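
One way to apply the same parameterization idea to XPath is the JDK’s XPathVariableResolver, which binds user input to a variable instead of splicing it into the query string. The sketch below assumes a hypothetical accounts document structure:

import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class AccountLookup {

    public Node findAccount(Document xml, String userSuppliedName) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Bind the user input to the $username variable rather than concatenating it into the query
        xpath.setXPathVariableResolver((QName name) ->
                "username".equals(name.getLocalPart()) ? userSuppliedName : null);

        XPathExpression query = xpath.compile("/accounts/account[@name = $username]");
        return (Node) query.evaluate(xml, XPathConstants.NODE);
    }
}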

Cross-Site Scripting 

Cross-Site Scripting (XSS) attacks happen when an attacker uses a web application to send malicious code (usually a browser-side script) to other users. Vulnerabilities that allow these attacks can occur anywhere a web application includes input from a user in the output it generates without validating or encoding it. 

To keep Java applications secure and prevent XSS, filter your inputs with a whitelist of allowed characters and use a proven library to HTML-encode your output for HTML contexts. For JavaScript contexts, use JavaScript Unicode escapes. 
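
As a sketch of output encoding for an HTML context, assuming the OWASP Java Encoder library (org.owasp.encoder) is available on your classpath:

import org.owasp.encoder.Encode;

public class CommentRenderer {

    // Encode untrusted input for the HTML body context so it renders as text, not markup
    public String renderComment(String untrustedComment) {
        return "<p>" + Encode.forHtml(untrustedComment) + "</p>";
    }
}

With this in place, a payload such as <script>alert(1)</script> is rendered as inert text rather than executed by the browser.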

Summary 

In summary, there are some key points to bear in mind when writing secure Java code. Think about security throughout development, from the design stage through code reviews, actively look for vulnerabilities in your Java code, and take advantage of the Java security APIs and libraries.

Use proven, well-regarded tools to monitor and log your applications for security issues, and investigate the full list of application attack types so you can follow the recommended prevention methods.

If you use these guidelines for writing secure Java code applications in your organization, you can protect yourself and your applications against malicious attacks and data theft.

Network Security: The Journey from Chewiness to Zero Trust Networking

Network security has changed a lot over the years; it had to. From wide open infrastructures to tightly controlled environments, the standard practices of network security have grown more and more sophisticated.

This post will take us back in time to look at the journey that a typical network has been on over the past 15+ years. From a wide open, “chewy” network, all the way to zero trust networking.

Let’s get started.

Network Security in the Beginning…

Let’s say we work at a company that’s running a simple three-tiered web architecture. We have a frontend service, a backend service, and a database. It’s 2005, and we run each of our services and the database on a separate machine.

[Image: trusted network]

The Problem? Our Network is Wide Open to Attack

An attacker gaining any sort of access to our network will be able to move throughout the entire network, exfiltrating data and causing havoc. Basically, our network is wide open to attack. Luckily for us, the fix is quite straightforward. At least, for now.

So, We Introduce Network Segmentation

Word reaches us about the new and improved security best practices, and we segment our network based on the “least privilege” principle. 

The Principle of Least Privilege

The principle of “least privilege” has become a staple of security thinking over the years. Its arguments are common sense. A service should have only the permissions that it needs to complete its function and nothing more. This sounds obvious, but from an engineering perspective, it is often easier to give something wide powers. This helps to avoid the need to revisit application permissions for every new feature change. 

From a networking perspective, the principle of least privilege argues that each server should only have the permissions and network access that it needs to run. Simple, right?

Applying the Principle of Least Privilege to our Network Security

[Image: least privilege security]

We split each of our servers into its own network segment, with a router between them. The router makes sure that services can only communicate with their appropriate databases. This is important: it means that if an attacker manages to compromise one of the backend segments, they cannot move laterally to the other segments. They can only reach the nodes that we have allowed. This limits the blast radius of an attack and makes for a much more difficult system to hack.

The Problem? Scale!

If we need to add any servers, or run additional services on these servers, the number of rules grows very quickly, as each rule needs to be duplicated for every affected machine. This poses a bit of an issue for us, but as long as the server count remains low, we should be alright. 

Unfortunately, over the past few years, architectural patterns like microservices have massively increased the number of virtual machines that we use. This increase makes this model far less viable, but we’ll get to that.

Moving to the Cloud

Our company, being savvy and technologically forward-thinking, decides to move to an early cloud offering. 

[Image: moving to the cloud]

So far, so good! We’ve taken the lessons of our physical network segmentation and applied them here. Our instances are segmented from each other using security groups, which are basically firewall rules enforced by a virtual appliance. 

Our new infrastructure enables our company to handle more traffic and, ultimately, make more money. Soon, however, even our shiny cloud infrastructure needs some work. 

Introducing the Cluster

To deal with scale, our company containerizes its services. The services move to a managed Kubernetes cluster, and the database moves to a managed cloud database. 

[Image: cluster architecture]

Very clean and elegant! (Yes, it’s oversimplified, but bear with me). Our services are now managed and leveraging auto-scale. 

The Problem? Still, Scale…

Our previous network setup relied on some pretty basic assumptions. We had a “Database server” and that server would host our database. We had a “Backend Server” and we could rely, 100% of the time, on our backend server hosting our backend application.

Now, however, there are no dedicated server roles. We find ourselves in a completely different paradigm. Our servers are simply agnostic hosts of docker containers and don’t know a great deal about the internals of these containers. So how do we set up rules to ensure a principle of least privilege?

The Solution? An Entirely New Approach

Before, we were thinking about servers in terms of their networking identity. For example, an IP address. This was adequate, but there is no longer a consistent mapping between IP address and service. We need to stop identifying services by some proxy variable, such as IP, and start identifying services directly. How do we do that, and how do we manage relationships between these identities?

[Image: service mesh]

The Service Mesh

A service mesh is a relatively new entry into the networking scene. It operates at a much higher level than traditional networking does. Rather than worrying about the underlying switches, routes, and so on, it registers itself with each application in your system and processes rules about application-to-application communication. The service mesh has some specific differences from a traditional setup that must be understood.

Intention-Based Security

Intention-based security refers to a style of declaring security rules. Rather than having obscure networking rules that directly configure underlying switches, we declare the intention of each service, at the service level. For example, Service A wishes to communicate with Service B. 

This abstracts the underlying server and means that we no longer have to rely on assumptions about IP addresses. We can declare our intentions directly.

The Mesh Proxy (Sometimes Called Sidecar)

The service mesh is made up of a control plane and a network of proxies. The control plane configures the proxies, based on the intentions declared by the user. The proxies then intercept all traffic moving in and out of the service and, if the rules apply, will transform, reroute or block the traffic entirely. This network of proxies makes up the service mesh.

The Problem? Risk of System Disruptions

The most obvious drawback of this architecture is that there are many, many proxies. At least one per service. This means that a single broken proxy can disrupt all traffic for a service. We need to be able to monitor the health and configuration for each of these proxies, otherwise we will quickly lose the ability to track down problems.

The Solution? A Powerful Observability Platform

Observability is the answer here. You need a powerful platform, through which you can interrogate your infrastructure. You can find out about memory, CPU and networking status of your proxies and services, and ensure that your entire service mesh is running optimally.

Coralogix provides world-class observability and regularly processes over 500,000 events every second. If you need to level up your insights and gain new control over your system, check out how we can help.

Our Journey

We began with simple networking rules that blocked traffic from server to server, and we have arrived at a point where we can identify services directly, by their human-readable names, and control traffic through a network of distributed proxies. As we can see, the arms race between security and increasingly sophisticated software techniques continues.

Wherever you find yourself on this journey, the most exciting thing is that there is always somewhere new to go!

Stop Enforcing Security Standards – Start Implementing Policies

In days gone by, highly regulated industries like pharmaceuticals and finance were the biggest targets for nefarious cyber actors, due to the financial resources at banks and drug companies’ disposal – their respective security standards were indicative of this. Verizon reported in 2020 that, whilst banks and pharma companies account for 25% of major data breaches, big tech and supply chain companies are increasingly at risk. 

Surely then, the way to protect against vulnerabilities and nefarious activities is to rigorously enforce security standards? Wrong.

In this piece we’re going to examine the landscape of information security policies today, and how new approaches and tools make security seamless.

Security Standards – As They Were

Security standards come in all shapes and sizes; some are relevant to developers, whilst others concern how a whole organization holds and handles data. Traditionally, security standards are enforced by an individual – typically an infosec or compliance professional. This approach has two flaws: the enforcer’s distance from the developers, and the broad strokes of infosec standards.

The Problem With the Old Way

Under this model, and particularly in big companies, information security and compliance are governed by separate teams or individuals. These people are normally non-technical and are organizationally separated from the development team. This means that the enforcers of security standards don’t always understand the implications of what they are enforcing, nor the people upon whom they are imposing the standards.

Additionally, recent research has shown that the security standards that we all know are applied like blankets from industry to industry. With no specificity for development methodology, organizational resource, or data being handled, these overarching principles don’t engage the developers that should be adhering to them. 

All this comes down to a reliance on people, be it compliance professionals or developers, to understand, enforce, and implement these policies. This is not only a manual task, but it’s also onerous and doesn’t embrace the models of successful agile product development and release.

DevSecOps – A New Way

If you’re familiar with Disney’s The Mandalorian, then you’ll know the unending mantra of “this is the way”, uttered by all members of the secret sect. DevSecOps has shown the technology industry that dictated standards aren’t the only way.

A shift-left mentality, DevSecOps requires organizations to bridge the gap (and in some cases, absorb the space) between development and security. An article on the rise and success of DevSecOps outlined three defining criteria of a true DevSecOps environment. First, developers should be in charge of security testing. Second, fixing security issues should be wholly managed by the development team. Third, ongoing security-related issues should be owned by the development team. 

Simple enough, right? 

Whilst the principles for DevSecOps success are straightforward, the practices are often less so. Creating a secure-by-design architecture and coding security elements into your applications are key ways of breaking down the silos that security standards created.

Security and Development – How to Promote Cohesion

Gartner states that cultural and organizational roadblocks are the largest impediments to unifying development and security operations teams and individuals. According to research from Gartner, surveyed CIOs, and leading software vendors, security should be wrapped around DevOps practices, not just shoved into the mix.

From Principle to Practice to Policy

What does wrapping security around DevOps mean? In theory, it’s allowing the expertise of SecOps engineers and compliance professionals to impact development. In practice, it means allowing these professionals’ knowledge of the changing security and threat landscape to permeate in day-to-day DevOps activities.

Take Kubernetes, for example. It provides network policies which, under the traditional model, may be allocated as part of an overarching InfoSec strategy. Policies applied that way are neither dynamic nor totally secure, and relying on them alone is setting yourself up for failure. Implementing Zero Trust Networking is a DevSecOps mindset: with tools like Istio, a service mesh provides both application- and container-level security through alerting policies and active prevention. 

Alerting Makes DevSecOps Easy

Alerting is key. It takes away the idea of rigid security standards and instead provides the flexibility of implementable policies throughout the application and network layers. In an article covering the DevSecOps keys to success, there is one recurring theme – use whatever tools at your disposal to increase process speed. A mature monitoring and alerting system is the lynchpin to rapid security and development practices and provides the foundation for automation.

By integrating monitoring and alerting capabilities into a SIEM dashboard, security events can be analyzed in a cross-cutting way to tie together many extraneous factors which would otherwise be disparate. Adding automation, even something as simple as advanced messaging, on top shortens response time and guarantees uptime. When so much of DevSecOps is reliability engineering, your monitoring and alerting tool is the quarterback in your stack.

Coralogix is the Platform for Your New DevSecOps Culture

Out-of-the-box, fully wrapped SaaS monitoring and alerting with built-in threat detection – “this is the way”.

Coralogix provides policy-based analysis to support your monitoring. On top of that, you get alerting with myriad integrations to plug into every component of your stack. This alerting allows for sophisticated policy creation based on security requirements, empowering a true DevSecOps mentality and workflow within your organization. 

Features like Flow Anomaly, ML-powered Dynamic Alerts, and a simple Alerts API mean you no longer need rigid security standards. Intelligent, inbuilt policies guarantee your applications and infrastructure can stay protected and progressive.

5 Common Elasticsearch Mistakes That Lead to Data Breaches

Avon and Family Tree aren’t companies you would normally associate with cybersecurity, but this year both were on the wrong side of it when they suffered massive data breaches. At Avon, 19 million records were leaked, and Family Tree had 25GB of data compromised. What do they have in common? Both were using Elasticsearch databases.

These are just the latest in a string of high-profile breaches that have made Elasticsearch notorious in cybersecurity. Bob Diachenko is a cybersecurity researcher who has been investigating vulnerabilities in NoSQL databases since 2015. 

He’s uncovered several high-profile cybersecurity lapses, including 250 million exposed Microsoft records. Diachenko’s research suggests that 60% of NoSQL data breaches involve Elasticsearch databases. In this article, I’ll go through five common causes of data breaches and show how the latest Elastic Stack releases can actually help you avoid them.

1. Always Secure Your Default Configuration Before Deploying

According to Bob Diachenko, many data breaches are caused by developers forgetting to add security to the default config settings before the database goes into production. To make things easier for beginner devs, Elasticsearch traditionally doesn’t include security features like authentication in its default configuration. This means that when you set up a database for development, it’s accessible to anyone who knows the IP address.

Avoid Sitting Ducks

The trouble starts as soon as a developer pushes an Elasticsearch database to the internet. Without proper security implementation, the database is a sitting duck for cyberattacks and data leaks. Cybersecurity professionals can use search engines like Shodan to scan for open IP ports indicating the presence of unsecured Elasticsearch databases. As can hackers. Once a hacker finds such a database, they can freely access and modify all the data it contains.

Developers who set up Elasticsearch databases are responsible for implementing a secure configuration before the database goes into production. Elasticsearch’s official website has plenty of documentation for how to secure your configuration and developers need to read it thoroughly.

Elasticsearch to the Rescue

That being said, let’s not put all the blame on lazy programmers! Elasticsearch acknowledges that the fast-changing cybersecurity landscape means devs need to take their documentation with a pinch of salt. Users are warned not to rely on old blog posts, as their advice is now considered dangerous. In addition, Elasticsearch security can be difficult to implement. Developers under pressure to cut time to market won’t necessarily be incentivised to spend an extra few days double-checking security.

To combat the threat of unsecured databases, Elasticsearch have taken steps to encourage secure implementation as a first choice. Elastic Stack 6.8 and 7.1 releases come with features such as TLS encryption and Authentication baked into the free tier. This should hopefully encourage “community” users to start focussing on security without worrying about bills. 

2. Always Authenticate

In 2018, security expert Sebastien Kaul found an Elasticsearch database containing tens of millions of text messages, along with password information. In 2019, Bob Diachenko found an Elasticsearch database with over 24 million sensitive financial documents. Shockingly, neither database was password protected.

So why are so many devs spinning up unauthenticated Elasticsearch databases on the internet? In the past, the default configuration didn’t include authentication, and devs used it because it was convenient and free.

To rub salt in the wound, Elasticsearch told users to implement authentication by placing an Nginx server between the client and the cluster. This approach had the downside that many programmers found setting up the correct configuration far too difficult.

Recognising these difficulties, Elasticsearch has recently upgraded the free configuration. It now includes native and file-based authentication, which takes the form of role-based access control. 

Elasticsearch developers can use Kibana to create users with custom roles that demarcate their access rights, giving different users different levels of access to the cluster.

3. Don’t Store Data as Plain Text

In his research, Bob Diachenko found that Microsoft had left 250 million tech support logs exposed to the internet. He discovered that personal information, such as email addresses, had been stored in plain text. 

In 2018, Sebastien Kaul found an exposed database containing millions of text messages containing plain text passwords.

Both of these are comparatively benign compared to Diachenko’s most recent find: a leaked database containing 1 billion plain text passwords. With no authentication protecting it, this data was ripe for hackers to plunder. Access to passwords would allow them to commit all kinds of fraud, including identity theft.

Even though storing passwords in plain text is seriously bad practice, many companies have been caught red-handed doing it. This article explains the reasons why.

Cybersecurity is No Laughing Matter

In a shocking 2018 Twitter exchange, a well-known mobile company admitted to storing customer passwords in plain text. They justified this by claiming that their customer service reps needed to see the first few letters of a password for confirmation purposes.

When challenged on the security risks of this practice, the company rep gave a response shocking for its flippancy.

“What if this doesn’t happen because our security is amazingly good?”

Yes, in a fit of poetic justice, this company later experienced a major data breach.  Thankfully, such a cavalier attitude to cybersecurity risks is on the wane.  Companies are becoming more security conscious and making an honest attempt to implement security best practice early in the development process. 

Legacy Practices

A well-known internet search engine stored some of its account passwords in plain text. When found out, they claimed the practice was a remnant from their early days: their domain admins had the ability to recover passwords and, for this to work, needed to see them in plain text.

Although company culture can be slow to change, many companies are undertaking the task of bringing their cybersecurity practices into the 21st century.

Logging Sensitive Data

Some companies have found themselves guilty of storing plain text passwords by accident. A well-known social media platform hit this problem when it admitted it had been storing plain text passwords. The platform’s investigation concluded:

“…we discovered additional logs of [the platform’s] passwords being stored in a readable format.”

They had inadvertently let their logging system record and store usernames and passwords as users were typing the information. Logs are stored in plain text and are typically accessible to anyone on the development team authorised to access them. Plain text user information in logs invites malicious actors to cause havoc.

On this point, make sure to use a logging system with strong security features. Solutions such as Coralogix are designed to conform to the most up to date security standards, guaranteeing the least risk to your company.

Hashing and Salting Passwords

In daily life we’re warned to take dodgy claims “with a pinch of salt” and told to avoid “making a hash of” something. Passwords, on the other hand, need to be taken with more than a pinch of salt and made as much of a hash of as humanly possible.

Salting is the process of adding extra random characters to a password before it is hashed, making the result harder to crack. For example, imagine you have the password “Password”. The system might add a salt to produce “Password123” before hashing (both “Password” and “Password123” are terrible passwords, by the way!)

Once your password has been salted, it then needs to be hashed. Hashing transforms your password to gibberish. A company can check the correctness of a submitted password by salting the password guess, hashing it, and checking the result against the stored hash. However, cybercriminals accessing a hashed password cannot recover the original password from the hash.
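
To illustrate that check in code, here is a minimal sketch of verifying a login attempt against a stored salt and hash, assuming the stored hash was produced with the JDK’s PBKDF2 implementation using the same parameters; MessageDigest.isEqual gives a timing-safe comparison.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.spec.InvalidKeySpecException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordVerifier {

    // Must match the parameters used when the stored hash was created (assumed values)
    private static final int ITERATIONS = 210_000;
    private static final int KEY_LENGTH_BITS = 256;

    public static boolean matches(char[] passwordGuess, byte[] storedSalt, byte[] storedHash)
            throws NoSuchAlgorithmException, InvalidKeySpecException {
        // Salt the guess with the stored salt, hash it, then compare against the stored hash
        PBEKeySpec spec = new PBEKeySpec(passwordGuess, storedSalt, ITERATIONS, KEY_LENGTH_BITS);
        byte[] guessHash = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                           .generateSecret(spec)
                                           .getEncoded();
        spec.clearPassword();
        return MessageDigest.isEqual(guessHash, storedHash); // constant-time comparison
    }
}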

4. Don’t Expose Your Elasticsearch Database to the Internet

Bob Diachenko has made it his mission to find unsecured Elasticsearch databases, hopefully before hackers do!  He uses specialised search engines to look for the IP addresses of exposed databases. Once found, these databases can be easily accessed through a common browser.

Diachenko has used this method to uncover several high profile databases containing everything from financial information to tech support logs. In many instances, this data wasn’t password protected, allowing Diachenko to easily read any data contained within. Diachenko’s success dramatically illustrates the dangers of exposing unsecured databases to the internet.

Because once data is on the web, anyone in the world can read it. Cybersecurity researchers like Bob Diachenko and Sebastien Kaul are the good guys. But the same tools used by white-hat researchers can just as easily be used by black-hat hackers.

If the bad guys find an exposed database before the good guys do, a security vulnerability becomes a security disaster. This is starkly illustrated by the recent tale of a hacker who wiped and defaced over 15,000 Elasticsearch servers, falsely implicating a legitimate cybersecurity firm in the process. 

The Elasticsearch documentation specifically warns users not to expose databases directly to the internet. So why would anyone be stupid enough to leave a trove of unsecured data open to the internet?

In the past, Elasticsearch’s tiering system has given programmers the perverse incentive to bake security into their database as late as possible in the development process. With Elastic Stack 6.8 and 7.1, Elasticsearch have included security features in the free tier. Now developers can’t use the price tag as an excuse for not implementing security before publishing, because there isn’t one.

5. Stop Scripting Shenanigans

On April 3 2020, ZDNet reported that an unknown hacker had been attempting to wipe and deface over 15,000 Elasticsearch servers. They did this using an automated script.

Elasticsearch’s official scripting security guide explains that all scripts are allowed to run by default. If a developer left this configuration setting unchanged when pushing a database to the internet, they would be inviting disaster.

Two configuration options control script execution: script types and script contexts. You can prevent unwanted script types from executing with the setting script.allowed_types: inline, which permits only inline scripts and blocks stored scripts.

To prevent scripts from running in risky contexts, Elasticsearch recommends restricting the script contexts option using script.allowed_contexts: search, update, which allows scripts only in the search and update contexts. If this isn’t enough, you can prevent any scripts from running by setting script.allowed_contexts to “none”.

Elasticsearch takes scripting security issues seriously and they have recently taken their own steps to mitigate the problem by introducing their own scripting language, Painless. 

Previously, Elasticsearch scripts would be written in a language such as JavaScript. This made it easy for a hacker to insert malicious scripts into a database.  Painless brings an end to those sorts of shenanigans, making it much harder to bring down a cluster.

Summary

Elasticsearch is one of the most popular and scalable database solutions on the market. However, it’s notorious for its role in data breaches. Many of these breaches were easily preventable and this article has looked at a few of the most common security lapses that lead to such breaches.

We’ve seen that many cases of unsecured databases result from developers forgetting to change Elasticsearch’s default configuration before making the database live. We also looked at the tandem issue of unsecured databases being live on the web, where anyone with the appropriate tools could find them.  

Recently, Elasticsearch have taken steps to reduce this by including security features in their free tier so programmers are encouraged to consider security early. Hopefully this alone provides developers a powerful incentive to address the above two issues.

Other issues we looked at were the worryingly common habit of storing passwords as plain text instead of salting and hashing them and the risks of not having a secure execution policy for scripts. These two problems aren’t Elasticsearch specific and are solved by common sense and cybersecurity best practice.

In conclusion, while Elasticsearch has taken plenty of recent steps to address security, it’s your responsibility as a developer to maintain database security.