Keeping IaC Secure: Common Security Risks in Infrastructure as Code

What is infrastructure as code (IaC)?

Infrastructure as Code (IaC) is the cloud-computing practice of putting the provisioning and configuration of your cloud resources into machine-readable code.

Without IaC, you manually configure each server, container, database, or API gateway (to name just a few of the resources that can be managed using IaC), then repeat that process whenever you need to update the settings or deploy another instance. With IaC, all your requirements and settings are recorded in code (typically YAML or JSON files, or a tool-specific configuration language) which you can apply as many times as you need.

Defining your infrastructure in software with IaC makes replicating your virtual machines, containers, and services efficient and consistent. Because of this, IaC is a key enabler of a fully automated DevOps process. 

By putting environment configuration and deployment logic into code, you can automatically refresh a pre-production environment before running automated tests on the latest build of your application and create multiple identical testing and staging environments for an automated CI/CD pipeline.

Programming infrastructure allows you to make the most of cloud-hosted resources because you can provision new services with minimal effort. For example, if you’re managing a containerized deployment with Kubernetes or Docker Swarm, putting your configuration into code means you can scale up your cluster quickly, knowing that each new node will be provisioned consistently.

Furthermore, storing IaC configuration files in source control means you can easily revert to an earlier version when needed and ensures you have an audit log of all changes to your infrastructure setup.

Tools such as AWS CloudFormation, HashiCorp’s Terraform, and Microsoft’s Azure Resource Manager allow you to use Infrastructure as Code to automate provisioning and configuration of cloud-hosted infrastructures, such as Amazon EC2 instances, Microsoft Azure resources, or Docker containers.

Common security risk areas in IaC Technology

As more organizations turn to cloud-based infrastructure to host their systems and services, Infrastructure as Code (IaC) is being used ever more widely. But for all its advantages, Infrastructure as Code can also present risks.

Misconfigurations or failing to change the default behavior of cloud resources can introduce security vulnerabilities and expose your organization to cyberattacks. Multi-cloud setups increase the complexity of systems, making it a challenge to maintain a comprehensive understanding of your computing estate and the impact that a change will have, increasing the likelihood of unintended consequences.

The following are some of the most common security risk areas for Infrastructure as Code.

Hardcoded secrets

Storing the credentials or private keys required to connect to particular services directly in your IaC configuration files is a severe security failing which unfortunately happens far too often. It usually occurs because authentication details have been hard-coded as a “temporary” solution while the code was still in development, only to be forgotten.

The result is that secrets remain visible in the configuration files, which are then added to source control. Even with access controls to your source code, this is too risky a place to store credentials. You’re effectively handing the keys to any hacker wishing to gain access to your systems.

Elevated privileges

As with authentication secrets, it’s not uncommon for developers to grant root privileges to services in the early phases of development simply because it’s quicker than creating a role that only has the required permissions. 

However, granting more permissions than are needed to the resources you create means that a malicious actor who has gained access to your system can more easily start using and modifying your system for their needs.

Default templates

If you’re using existing templates to create cloud resources such as VMs or containers, there is a high chance that the template will include insecure default configurations, such as public SSH ports or unencrypted data. Use templates from trusted sources and always check that the settings are what you need before using them as the basis for your infrastructure.
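As a minimal sketch of overriding an insecure default, the Terraform snippet below restricts SSH to a single administrative network instead of leaving port 22 open to the world. It assumes the AWS provider, and the variables vpc_id and admin_cidr are hypothetical inputs introduced only for illustration.

```hcl
# Hypothetical inputs: the VPC to attach to and the only network allowed to use SSH.
variable "vpc_id" {
  type = string
}

variable "admin_cidr" {
  type        = string
  description = "CIDR range allowed to reach SSH, for example an office VPN range"
}

resource "aws_security_group" "ssh_restricted" {
  name        = "ssh-restricted"
  description = "Allow SSH only from the admin network"
  vpc_id      = var.vpc_id

  # SSH is reachable from the admin network only, never from 0.0.0.0/0.
  ingress {
    description = "SSH from the admin network"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr]
  }
}
```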

Insecure communications

Several IaC management tools use a centralized controller that communicates with multiple nodes to configure resources. In these cases, it’s essential to secure the communications between the control plane and worker nodes using SSL/TLS to prevent them from being visible to anyone who cares to look.

Unencrypted data

In addition to securing data in transit, your IaC configuration should ensure that any sensitive data is encrypted at rest. Even if access is secured, it’s prudent to have multiple layers of security just in case one of your defenses is breached.
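Encryption at rest can usually be switched on in the resource definition itself. The following sketch, assuming a recent version of the AWS provider, shows an encrypted EBS volume and default server-side encryption for an S3 bucket. The availability zone and bucket name are placeholders.

```hcl
# An EBS volume that is encrypted at rest.
resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a" # assumed zone
  size              = 20           # GiB
  encrypted         = true
}

# A bucket whose objects are encrypted by default with a KMS key.
resource "aws_s3_bucket" "reports" {
  bucket = "example-reports-bucket" # hypothetical; bucket names must be globally unique
}

resource "aws_s3_bucket_server_side_encryption_configuration" "reports" {
  bucket = aws_s3_bucket.reports.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```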

Configuration drift

Configuration drift refers to discrepancies between the infrastructure defined in code and what is actually deployed. It usually results from making manual changes directly to resources rather than updating the code and deploying the changes as part of an automated IaC process.

Bypassing the normal process usually means your changes have not gone through your standard automated tests and security checks (more on that below), increasing the likelihood that a security flaw will be introduced.

Tips for implementing security in IaC

Adopting IaC ensures that your infrastructure is always configured consistently and means you can easily audit your settings to ensure they meet your requirements. As long as you apply security best practices to your configuration files, using infrastructure as code will improve your security posture by reducing the likelihood of security flaws in your infrastructure. 

On the other hand, failing to apply security best practices to IaC means any security flaws will propagate through your estate, creating multiple opportunities for an attacker. Let’s look at some key security practices for IaC.

Adopt a DevSecOps mindset

Infrastructure as Code is a DevOps practice that applies software development techniques to operations tasks. Defining your infrastructure in software, putting it through a series of automated tests, and automatically deploying it gives you frequent opportunities for rapid feedback on the changes you’ve made to your resources, so that you can address any issues as early as possible.

DevSecOps extends this approach to security considerations. Static Application Security Testing (SAST) tools allow you to check IaC files for known vulnerabilities as you write them so that you can address issues immediately.

Having checked the static code, you can put your IaC changes through an automated CI/CD pipeline and use Dynamic Application Security Testing (DAST) tools to test the code at runtime and get rapid feedback before changes are deployed. Building security testing into the development process for IaC significantly reduces the likelihood of security issues in production.

Store secrets securely

Rather than hardcoding secrets in your configuration files, use a secure storage mechanism such as AWS Secrets Manager or Microsoft’s Azure Key Vault and reference these from your IaC files. IDE security plugins for IaC can check for insecurely stored secrets and remind developers to move credentials out of the code and into a secret vault.
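As an illustration, a Terraform configuration can look a secret up at apply time instead of embedding it. This sketch assumes the AWS provider and a secret that already exists in AWS Secrets Manager. The secret name and database settings are hypothetical.

```hcl
# Read the current value of an existing secret; nothing sensitive is written in this file.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/app/db-password" # hypothetical secret name
}

resource "aws_db_instance" "app" {
  identifier          = "app-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = data.aws_secretsmanager_secret_version.db_password.secret_string
  storage_encrypted   = true
  skip_final_snapshot = true
}
```

Note that values read this way are still recorded in the Terraform state file, so the state itself should be stored encrypted and with restricted access.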

Apply the principle of least privilege

Applying the principle of least privilege means granting only the permissions that are required and no more. For example, you might only assign read permissions if you need a resource to retrieve values from a database. Setting broader permissions than are required – such as write or delete – might be easier, but it also provides opportunities for malicious actors that have gained access to your system to misuse resources.
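The read-only case described above might be sketched in Terraform as follows, assuming the AWS provider and a DynamoDB table as the database. The variable orders_table_arn and the policy name are hypothetical.

```hcl
variable "orders_table_arn" {
  type        = string
  description = "ARN of the one table this service needs to read"
}

# Grant only the read actions that are needed, on one table, and nothing else.
data "aws_iam_policy_document" "read_orders" {
  statement {
    sid    = "ReadOrdersTableOnly"
    effect = "Allow"
    actions = [
      "dynamodb:GetItem",
      "dynamodb:Query",
    ]
    resources = [var.orders_table_arn]
  }
}

resource "aws_iam_policy" "read_orders" {
  name   = "read-orders"
  policy = data.aws_iam_policy_document.read_orders.json
}
```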

Applying least privilege also means de-provisioning accounts when they are no longer needed. Creating an automated onboarding and offboarding process helps ensure that developer accounts no longer in use do not remain available for someone to hijack.

Tag all resources

Automatically tagging all your cloud resources as they are created not only ensures that you can identify the owner of each VM, container, database, file share, or gateway if an issue arises but also reduces the likelihood of unmanaged ghost resources. Untagged resources can make managing costs difficult and present a security risk. 

Any misuse – such as cryptocurrency mining – can easily go undetected for weeks. When you spot potentially suspicious activity on an untagged resource, it takes much longer to track down the team that owns it and confirm whether the resource should be terminated.
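One way to tag automatically, assuming Terraform with the AWS provider, is the provider-level default_tags block, which applies the same tags to every taggable resource the provider creates. The region and tag values here are placeholders.

```hcl
provider "aws" {
  region = "eu-west-1" # assumed region

  # Applied to every taggable resource created through this provider.
  default_tags {
    tags = {
      Owner       = "platform-team"
      Environment = "staging"
      ManagedBy   = "terraform"
    }
  }
}
```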

Enforce policies

If your organization must adhere to particular standards or regulations – such as HIPAA, GDPR, or ISO 27001 – use policy guardrails to audit your configuration for compliance as part of your automated pre-deployment checks.
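Dedicated policy-as-code tools are the usual way to implement such guardrails, but a lightweight version can be expressed in Terraform itself. The sketch below uses a variable validation block to reject deployments outside an approved set of regions. The region list is an assumption chosen purely for illustration.

```hcl
variable "region" {
  type        = string
  description = "Region to deploy into"

  # The plan fails before anything is created if the region is not approved.
  validation {
    condition     = contains(["eu-west-1", "eu-central-1"], var.region)
    error_message = "Resources must be deployed in an approved EU region."
  }
}
```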

Put all IaC changes through the pipeline

Having developed an automated pipeline that checks for security flaws such as hardcoded secrets and validates that your infrastructure complies with relevant policies, it’s vital to ensure that the process is never bypassed. 

When changes need to be made urgently, it’s safer to put them through the pipeline to verify them before deployment than to make a manual change that may introduce unexpected issues. 

Furthermore, because IaC is version controlled in the same way as application code, avoiding manual configuration ensures you always have an audit log of all changes and can roll back to an earlier version if needed.

Collect and analyze logs

Finally, collecting and analyzing log files from your cloud resources in real-time will allow you to monitor your infrastructure and identify issues in production as they emerge.

Benefits of achieving IaC security

Building security into the development process for IaC reduces the likelihood of vulnerabilities in your production system and, therefore, the potential for attacks. This leaves operations teams more capacity to check the production infrastructure for security issues and address emerging vulnerabilities.

When changes need to be made, having an automated process that checks your IaC for security flaws means you can deploy updates quickly and consistently while minimizing the risk of further issues.

For organizations required to comply with particular regulations – for example, on handling personally identifiable information or health data – building security into the pipeline enables you to demonstrate continuous compliance with relevant policies with little manual effort.

Top IaC security tools

An essential part of securing your infrastructure as code is using tools that can alert you to security flaws. You can choose between individual tools that target the particular areas described below and platforms designed to address multiple stages of the IaC lifecycle, such as Bridgecrew and Tenable CS.

IDE plugins

IDE plugins provide you with feedback on your code while you’re developing. Addressing issues while you’re writing the changes is more efficient than waiting until later in the development cycle, so it’s good practice to use plugins that check for known security issues as part of your development process. Tools such as TFLint, Docker Linter, Checkov, Snyk, and Cycode integrate with the developer workflow to provide rapid feedback on IaC configurations.

IaC template scanners

To provide immediate feedback, template scanners check VM and container images for known security flaws. Examples include CloudSploit and Anchore.

Cloud Security Posture Management Tools

Finally, Cloud Security Posture Management (CSPM) tools provide security checks at runtime by continuously checking your cloud deployments for misconfigurations. Some also perform drift detection to alert you when changes have been made to resources without going through the normal IaC pipeline. Examples include CloudGuard, Falcon Horizon, and Cloud One Conformity.

Take Away

Infrastructure as code offers many advantages, but it’s important to understand your cloud systems and security protections. If you’re using templates, ensure they don’t introduce vulnerabilities. Using security tools and automated tests will help to guard against misconfigurations and identify flaws early, reducing opportunities for malicious actors to attack your system. 

How IaC helps integrate Coralogix with Terraform

Infrastructure as Code is an increasingly popular DevOps paradigm. IaC has the ability to abstract away the details of server provisioning. This tutorial will look at how Coralogix can be used with the popular IaC tool Terraform.

Terraform

Terraform, a tool we’ve previously talked about, is HashiCorp’s answer to the problem of server provisioning. It uses the powerful paradigm of Infrastructure as Code (IaC). IaC abstracts and simplifies the traditional process of setting up and configuring servers by representing server configurations as code files.

This brings a range of benefits to DevOps teams such as automating deployment processes, providing effective infrastructure documentation, and enabling infrastructure validation.

Terraform itself is a binary that makes API calls to services such as Coralogix or AWS through plugins called providers, which tell each service to perform particular tasks. Users drive Terraform through its CLI and describe the infrastructure they want in configuration files.

Terraform Configuration Language

Terraform represents the various objects of your infrastructure as resources. These are stored in configuration documents. The syntax of a typical configuration document looks something like this:

resource "aws_vpc" "main" {
  cidr_block = var.base_cidr_block
}

Terraform language components come in three types: blocks, arguments, and expressions.  

Blocks

Blocks are containers for other content.

<BLOCK TYPE> "<BLOCK LABEL>" "<BLOCK LABEL>" {
  # Block body
  <IDENTIFIER> = <EXPRESSION> # Argument
}

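Blocks are made up of three elements. The block type identifies what kind of object the block describes, such as a resource, a variable, or a provider. Block labels name the block, and the number of labels a block takes depends on its type: a resource block, for example, takes two labels, the resource type and a local name. The body of a block is where the content is stored. This content can be arguments or further nested blocks.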
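To make the block anatomy concrete, here is a short sketch showing block types with one, two, and no labels, plus a nested block. The CIDR value and provider version constraint are illustrative assumptions.

```hcl
# One label: the variable name.
variable "base_cidr_block" {
  type    = string
  default = "10.0.0.0/16"
}

# Two labels: the resource type and a local name used to refer to it elsewhere.
resource "aws_vpc" "main" {
  cidr_block = var.base_cidr_block # an argument inside the block body
}

# No labels, with another block nested inside the body.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}
```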

Arguments and expressions

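Arguments assign a value to a name, while expressions are what produce those values, either literally or by referencing and combining other values. In the first example, cidr_block = var.base_cidr_block is an argument, and var.base_cidr_block is the expression that supplies its value.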
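A few common kinds of expression are sketched below. The variable and local names are hypothetical and exist only to show how values can be referenced and combined.

```hcl
variable "environment" {
  type    = string
  default = "staging"
}

locals {
  # A literal expression.
  base_name = "web"

  # A template expression that combines two values into one string.
  full_name = "${local.base_name}-${var.environment}"

  # A conditional expression that picks a value based on another value.
  instance_count = var.environment == "prod" ? 10 : 2
}

output "full_name" {
  value = local.full_name
}
```

With the default value of environment, full_name evaluates to "web-staging" and instance_count to 2.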

Terraform’s configuration language is declarative.  As Yevgeniy Brikman explains, this shows its advantage when making configuration changes. For example, if you wanted to deploy 10 EC2 instances you might use the following configuration file:

resource "aws_instance" "example" {
  count = 10
  ami = "ami-40d28157"
  instance_type = "t2.micro"
}

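You can change the number simply by editing the count argument. If you wanted 15 EC2 instances instead of ten, you can write count = 15. Terraform compares the configuration with what is already deployed and works out that five more instances are needed, so you don’t have to worry about the configuration change history.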

Using Terraform

Terraform has a range of useful applications.  For one, it can simplify the setup of Heroku applications. Heroku is popular due to its ability to scale apps using dynos, but building anything complex quickly requires lots of add-ons.

Terraform, by using IaC, can make setting up these add-ons much simpler. Heroku add-ons can be specified in a Terraform configuration document. Terraform even allows you to do fancy stuff like using Cloudflare as a CDN for your application.

Another use is 2-tier applications that involve a pool of web servers using a database.  For these to run successfully the connection between servers and database must be seamless.  Additionally, both tiers must be up and running to execute functionality. You don’t want any of your servers trying to hit a database that isn’t there.

With IaC, Terraform can handle the infrastructure, ensuring the necessary dependencies are in place and that the database is up before servers are provisioned. Plus, this can all be done with a few configuration documents. 
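The ordering described above falls out of references between resources. In this sketch, which assumes the AWS provider, the web servers read the database address, so Terraform creates the database first. The variables db_password and ami_id, and the way the address is passed to the servers, are illustrative assumptions.

```hcl
variable "db_password" {
  type      = string
  sensitive = true
}

variable "ami_id" {
  type = string
}

resource "aws_db_instance" "db" {
  identifier          = "app-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = var.db_password
  skip_final_snapshot = true
}

resource "aws_instance" "web" {
  count         = 2
  ami           = var.ami_id
  instance_type = "t2.micro"

  # Referencing the database address creates an implicit dependency,
  # so the servers are provisioned only after the database exists.
  user_data = <<-EOT
    #!/bin/bash
    echo "DB_HOST=${aws_db_instance.db.address}" >> /etc/environment
  EOT
}
```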

Coralogix

Many Terraform applications produce lots of logging data. As a case in point, Heroku logs are notorious for the amount of data they generate.  To really reap the benefits of IaC in your application development, you need good observability.

This is where Coralogix (which has pre-built Heroku visualizations) comes in. It uses machine learning to automatically extract patterns and trends from data.  

Using Coralogix with Terraform

As with other systems that it integrates with, you can use Terraform to interact with Coralogix, first by configuring it with the Coralogix provider, and second, by setting rules and alerts through the Terraform CLI.

Coralogix Provider

The Coralogix provider enables you to define rules and alerts through Terraform’s IaC paradigm. If you are running Terraform 0.13 or later, the code for the provider is:

terraform {
  required_providers {
    coralogix = {
      source  = "coralogix/coralogix"
      version = "~> 1.0"
    }
  }
}

If you have Terraform 0.12 or earlier, the following code should be used:

# Configure the Coralogix Provider
provider "coralogix" {
    api_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

The value of the API key can be stored in an environment variable called API_KEY. If you’re an admin user, you can generate an API key from the Coralogix dashboard by going to Settings -> Account and clicking on API Access. This will let you create an Alerts & Rules API key. Since the API key is a sensitive value, you can use an infrastructure as code management platform to store it securely.

Along with the API key, there are two optional arguments. url contains the Coralogix API URL which is stored in the environment variable API_URL. timeout is an argument specifying when the Coralogix API will time out. This information is stored in the CORALOGIX_API_TIMEOUT environment variable.
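One way to keep the key out of the configuration file is to pass it in as a Terraform input variable. This is a sketch rather than the provider’s documented setup, and the variable name coralogix_api_key is hypothetical.

```hcl
variable "coralogix_api_key" {
  type        = string
  description = "Coralogix Alerts & Rules API key"
  sensitive   = true # requires Terraform 0.14 or later; omit on older versions
}

provider "coralogix" {
  api_key = var.coralogix_api_key
}
```

Terraform reads input variables from environment variables prefixed with TF_VAR_, so exporting TF_VAR_coralogix_api_key in the shell or CI system supplies the value without it ever appearing in source control.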

Log Parsing Rules

It’s important for DevOps engineers to effectively manipulate logging data. In Coralogix, this is enabled through log parsing rules. These are rules for processing, parsing, and restructuring log data. Rules come in various types. For example, parse rules allow you to create secondary logs based on data from primary logs.

Coralogix organizes rules into Rules Groups. These are structures that contain sets of rules, along with a Rule Matcher which ensures only the desired logs are processed.

Manipulating Rules Groups with Terraform

Terraform allows users to create, read, update, and delete Coralogix Rules Groups through its Coralogix Rules Groups resource.

In this example, we are creating a group called “My Group”.

# Create "My Group" Rules Group
resource "coralogix_rules_group" "rules_group" {
    name    = "My Group"
    enabled = true
}

In addition to the arguments included in the example, there are two optional arguments. The description argument allows you to add a description summarizing the Group’s purpose. The creator argument shows who created the rules group.

Manipulating Rules

Terraform lets you work not just with Rules Groups but with the Rules themselves. An existing rule can be read using this data source.

data "coralogix_rule" "rule" {
    rule_id        = "e1a31d75-36ab-11e8-af8f-02420a00070c"
    rules_group_id = "e10ef9d1-36ab-11e8-af8f-02420a00070c"
}

Rules can be created in the following way.

# Create "My Rule" Rule
resource "coralogix_rule" "example" {
    rules_group_id = "e10ef9d1-36ab-11e8-af8f-02420a00070c"
    name           = "My Rule"
    type           = "extract"
    description    = "My Rule created with Terraform"
    # Backslashes and double quotes inside the regular expression are escaped for HCL.
    expression     = "(?:^|[\\s\"'.:\\-\\[\\]\\(\\)\\{\\}])(?P<severity>DEBUG|TRACE|INFO|WARN|WARNING|ERROR|FATAL|EXCEPTION|[I|i]nfo|[W|w]arn|[E|e]rror|[E|e]xception)(?:$|[\\s\"'.:\\-\\[\\]\\(\\)\\{\\}])"
    rule_matcher {
        field      = "applicationName"
        constraint = "prod"
    }
}

As with Rules Groups, rules have a name, description, and enabled flag.  They also have three other arguments.

The rules_group_id argument contains the ID of the rules group that the rule belongs to. This allows users to know what Rules Group a rule is part of and re-assign rules to different rules groups.

The type specifies what type the rule is. As explained at the beginning of this section, log parsing rules can come in different types. In this case, the rule type is “extract,” meaning that the rule is designed to extract information from a log and append additional fields to it.

The expression contains the rule itself in the form of a regular expression. In the above example, the rule is designed to search for logs containing words including DEBUG, TRACE, WARNING, and EXCEPTION.
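Rather than hardcoding the group ID, a rule can reference a Rules Group defined in the same configuration. The sketch below uses only the arguments described above. It assumes that the id attribute exported by the rules group resource is the value rules_group_id expects, and the names and regular expression are illustrative.

```hcl
resource "coralogix_rules_group" "severity" {
  name        = "Severity Extraction"
  description = "Rules that extract a severity field from log text"
  enabled     = true
}

resource "coralogix_rule" "severity" {
  # Assumption: the group's id attribute is the ID that rules_group_id expects.
  rules_group_id = coralogix_rules_group.severity.id
  name           = "Extract severity"
  type           = "extract"
  description    = "Captures ERROR or WARN into a severity field"
  enabled        = true
  expression     = "(?P<severity>ERROR|WARN)"

  rule_matcher {
    field      = "applicationName"
    constraint = "prod"
  }
}
```

Because the rule refers to the group, Terraform creates the group first and keeps the two in step if the group is ever recreated.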

Alerts

A key feature of Coralogix is the ability to create alerts. They enhance observability by alerting DevOps engineers whenever a parameter leaves its optimal state. Terraform lets users define Coralogix alerts with the Coralogix alert resource. An existing alert can be looked up by its ID with the corresponding data source:

data "coralogix_alert" "alert" {
    alert_id        = "3dd35de0-0e10-11eb-9d0f-a1073519a608"
}

Here is how to create an alert.

# Create "My Alert" Alert
resource "coralogix_alert" "example" {
    name     = "My Alert"
    severity = "info"
    enabled  = true
    type     = "text"
    filter {
        text         = ""
        applications = []
        subsystems   = []
        severities   = []
    }
    condition {
        condition_type = "more_than"
        threshold      = 100
        timeframe      = "30MIN"
    }
    notifications {
        emails = [
            "user@example.com"
        ]
    }
}

Just like Rules and Rules Groups, each alert has a name argument and an enabled flag. Moreover, there are plenty of additional arguments to determine the properties of the alert.

There are two required arguments in addition to the name and enabled.

The type determines the alert type. This can be either “text” or “ratio.” Text alerts simply provide a message when a system parameter exceeds a certain threshold. Coralogix also provides dynamic alerts, which update the threshold using machine learning.

Ratio alerts are slightly more complex. They let you calculate a ratio between two log queries, something that can be useful in areas ranging from system health to marketing.

Severity specifies the alert’s urgency. It can take three values. “Info” means the alert simply provides information and the user is under no pressure to act on it. “Warning” is for alerts that warn of an impending problem, such as disk space about to run out. “Critical” is for alerts that require immediate action, such as a system outage.

There are four block arguments, that is, arguments whose values are Terraform Configuration Language blocks: filter, condition, schedule, and notifications.

The filter defines what input the alert needs to respond to. This could be particular logs or application behavior. The block contains four optional fields. Text specifies the string query to be alerted on. 

Users can decide what applications and subsystems the alert should respond to with the applications and subsystems fields. The severities field enables users to list the log severity levels they want to be alerted on.

Condition is where users can define the threshold that triggers the alert. It has three required fields: condition_type works like a relational operator in Java or Python, threshold specifies the number of log occurrences that should trigger the alert, and timeframe sets the time window over which those occurrences are counted (“30MIN” in the example above).

Schedule determines when the alert should be triggered while notifications control who gets notified about the alert. 
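Putting these arguments together, an alert can be parameterized so that recipients are managed in one place. This sketch reuses only the fields shown in the example above. The variable alert_emails is hypothetical, and the lowercase severity value "critical" and the application name "prod" are assumptions based on the values used earlier.

```hcl
variable "alert_emails" {
  type    = list(string)
  default = ["user@example.com"]
}

resource "coralogix_alert" "prod_timeouts" {
  name     = "Production timeouts"
  severity = "critical" # assumed to follow the same lowercase form as "info"
  enabled  = true
  type     = "text"

  # Alert on logs from the "prod" application that contain the word "timeout".
  filter {
    text         = "timeout"
    applications = ["prod"]
    subsystems   = []
    severities   = []
  }

  # Trigger when more than 50 matching logs arrive within the timeframe.
  condition {
    condition_type = "more_than"
    threshold      = 50
    timeframe      = "30MIN"
  }

  notifications {
    emails = var.alert_emails
  }
}
```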

Wrapping Up

In this tutorial, we’ve seen how Coralogix can be used in tandem with the popular IaC tool Terraform.  Coralogix can be integrated with Terraform through the Terraform Coralogix provider and Terraform provides plenty of enabling features to use key aspects of Coralogix, like rules and alerts.

The power of Infrastructure as Code is that it allows you to configure DevOps infrastructure with the same ease that you write code. Being able to apply that to observability is a very powerful tool.