Heroku Continuous Integration & Deployment with Docker [Hands-On Tutorial]

In this tutorial, we will deploy a Node.js application to Heroku through CircleCI using Docker. We will set up a Heroku Continuous Integration and Deployment (CI/CD) pipeline using Git as a single source of truth.

Containerization allows developers to create and deploy applications faster, with a wide range of other benefits: increased security, efficiency, agility to integrate with DevOps pipelines, portability, and scalability. Companies and organizations are therefore adopting the technology at scale, which benefits developers and removes overhead for the operations team across the software development lifecycle.

Benefits of having a Heroku CI/CD pipeline:

  1. Improved code quality and testability
  2. Faster development and code review
  3. Risk mitigation
  4. Deploy more often

Requirements

To complete this tutorial you will need the following:

  1. Git and GitHub
  2. Heroku free tier account
  3. CircleCI
  4. Node.js development environment
  5. Docker and Docker Compose
  6. Redis

Overview

Application architecture is fundamental: it lets us align the various components in a systematic way. Let’s define the architecture of our example application so that we have a clear picture before writing code and integrating the CI/CD pipelines.

We’ll be using Docker to deploy our Node.js application, since Docker makes it easy to ship the application to cloud platforms. CircleCI will be our Heroku Continuous Integration and Deployment tool, where we will run our unit and integration tests. If the tests pass, we are good to deploy our application on Heroku using the Container Runtime.

Setup Application

Let’s start by creating a small Node.js project. I’ve already created a small Node.js application that uses Redis to store key/value pairs. You can find the application in the GitHub repo under the nodejs-project branch.
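Before we clone it, here is a minimal sketch of what such a server might look like. The actual server.js in the repo may differ, but the /ping and /dummy/:id endpoints match the requests we make later in this tutorial:

// Hypothetical sketch of server.js; the repo's implementation may differ
const express = require('express');
const redis = require('redis');

const app = express();
const port = process.env.PORT || 3000;

// REDIS_URL falls back to a local Redis instance
const client = redis.createClient(process.env.REDIS_URL || 'redis://localhost:6379');
client.on('connect', () => console.log('Redis client connected'));

app.use(express.json());

// Simple health-check endpoint
app.get('/ping', (req, res) => res.send('pong'));

// Save and read key/value pairs in Redis
app.post('/dummy/:id', (req, res) => {
  client.set(req.params.id, JSON.stringify(req.body), () => res.send('Successful'));
});
app.get('/dummy/:id', (req, res) => {
  client.get(req.params.id, (err, value) => res.send(value));
});

app.listen(port, () => console.log(`Example app listening at http://localhost:${port}`));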

You can fork and clone the repo and set up the Node.js application locally by running the following–

git clone -b nodejs-project https://github.com/ankitjain28may/heroku-dockerize.git
cd heroku-dockerize

Our application uses Redis for storing key/value pairs, so if you have Redis installed and running locally, we can test the application once before dockerizing it by running the following command:

npm start
# Output
> heroku-dockerize@1.0.0 start /heroku-dockerize
> node server.js

Example app listening at http://localhost:3000
Redis client connected

Let’s browse our application by opening the URL http://localhost:3000/ping in the browser. If you don’t have Redis configured locally, no problem: we will run this application with Docker Compose locally before pushing it to Git.

Dockerize Application

This is our own Node.js application, so no Docker image exists for it yet. Let’s write one by creating a file named `Dockerfile`:

# Use the official Node.js 14 alpine image as the base
FROM node:14-alpine

# Install curl (used by the healthcheck later) and bash
RUN apk add --no-cache --update curl bash
WORKDIR /app

# Build arguments with sensible defaults
ARG NODE_ENV=development
ARG PORT=3000
ENV PORT=$PORT

# Copy package.json and package-lock.json first to leverage layer caching
COPY package* ./
# Install the npm packages
RUN npm install && npm update

COPY . .

# Run the image as a non-root user
RUN adduser -D myuser
USER myuser

EXPOSE $PORT

CMD ["npm", "run", "start"]

We use the official node:14-alpine image from Docker Hub as our base image. We prefer the alpine variant because it’s minimal and lightweight; it doesn’t carry additional system packages/libraries, and we can always install what we need on top of it using the package manager.

We have defined `NODE_ENV` and `PORT` as build arguments. By setting `NODE_ENV=production` while building an image for the production environment, npm will install only the application dependencies, leaving out all the devDependencies.
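For example, building a production image might look like this (the tag name is illustrative):

docker build -t heroku-dockerize:prod --build-arg NODE_ENV=production .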

Note: I am also running `npm update` along with `npm install`. This is generally not a best practice: updated dependencies might break the application while running in production, which can result in downtime.

That said, we will be setting up our Heroku Continuous Integration pipeline, where our integration tests and code linting can catch such issues before the application is deployed.

Awesome! Let’s build our image using the following command:

docker build -t heroku-dockerize:local --build-arg PORT=3000 .

Let’s check our image by running `docker images` command–

docker images

The output will be something similar as shown in the image below:

We can now run our application container using the image created above by running the following command:

docker run --rm --name heroku-dockerize -p 3000:3000 heroku-dockerize:local 

We can see the output as shown in the image below–

Oops!!! We’ve dockerized our Node.js application but we haven’t configured Redis yet. We need to create another Docker container using the Redis official docker image and create a network between our application container and the Redis container.

Too many things to take care of, so here comes Docker Compose to help. Compose is a tool for defining and running multi-container applications with ease, removing the manual overhead of creating networks, attaching containers to them, and so on.

Let’s set up Docker Compose by creating a file named `docker-compose.yaml`

version: "3.4"

networks:
  internal:

services:
  web:
    build:
      context: .
      args:
        - NODE_ENV=production
    container_name: heroku-dockerize-web
    
    networks:
      - internal
    ports:
      - "3000:${PORT}"
    env_file:
      - .env
    depends_on:
      - db
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "https://localhost:${PORT}/ping"]
      interval: 10s
      timeout: 10s
      retries: 3
      start_period: 10s
  db:
    image: redis:latest
    container_name: heroku-dockerize-redis
    networks:
      - internal
    restart: always
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 1s
      timeout: 3s
      retries: 3
      start_period: 1s

We have defined two services: `web` for our Node.js application and `db` for Redis.

Add the following variables to the `.env` file:

PORT=3000
REDIS_URL=redis://db:6379

It is always recommended to store sensitive values and credentials in a `.env` file and to add that file to `.gitignore` and `.dockerignore`, so it never gets committed or baked into the image.

Let’s create a `.gitignore` and `.dockerignore` file and add the following–

node_modules
.env
.git
coverage

Awesome! Let’s spin up our containers using docker-compose–

docker-compose up -d

We can check whether both services are up and running by running the following command —

docker-compose ps

Let’s browse our application and make a dummy request to save data to our Redis service.

curl -X POST \
      http://localhost:3000/dummy/1 \
      --header 'Content-Type: application/json' \
      --data '{"ID":"1","message":"Testing Redis"}'
# Output
Successful

We have successfully saved our data in Redis. Let’s make a GET request to `/dummy/1` to check it–

curl -X GET http://localhost:3000/dummy/1
#Output
{"ID":"1","message":"Testing Redis"}

Note: We can use Postman to make an API request.

Before moving forward, let’s discuss the pros and cons of the Heroku Container Runtime.

Heroku Container Runtime

Heroku traditionally uses the Slug Compiler, which compresses our code, creates a “slug”, and distributes it to the dyno manager for execution. That, in short, is what the Slug Compiler does. With the growing adoption of containerization and its benefits, Heroku has also announced the Heroku Container Runtime, which allows us to deploy Docker images.

With the Heroku Container Runtime, we get all the benefits of Docker: isolation of our application from the underlying OS, increased security, and the flexibility to create our own custom stack. With Docker, we have assurance that an application running inside a container on our local machine will also run in the cloud or on other machines, as the only difference between them is the environment, i.e. config files or environment variables.

With great power comes great responsibility

In the spirit of this popular quote, we take on more responsibility than with the traditional slug compiler: maintaining and upgrading our Docker image and fixing any security vulnerabilities. Beyond that, the Heroku Container Runtime is not yet mature and doesn’t support some of Docker’s most useful features. These are some of the major missing features we normally use with Docker–

  1. Volume mounts are not supported and the filesystem is ephemeral.
  2. Network linking across dynos is not supported.
  3. Only the web process (web dyno) is exposed to traffic, and PORT is set dynamically by Heroku at runtime as an environment variable, so the application must listen on that port or it will fail to run (see the snippet after this list).
  4. It doesn’t respect the EXPOSE instruction in the Dockerfile.
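Regarding point 3, our server must bind to whatever port Heroku injects rather than a hard-coded value. A minimal sketch of the pattern:

const express = require('express');
const app = express();

// Heroku sets PORT at runtime; fall back to 3000 for local development
const port = process.env.PORT || 3000;

app.listen(port, () => console.log(`Listening on port ${port}`));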

We will be using Heroku Container Runtime for deploying our application along with Heroku Manifest. With Heroku manifest, we can define our Heroku app and set up addons, config variables, and structure our images to be built and released.

Let’s define our Heroku Manifest `heroku.yml` file. 

setup:
  addons:
    - plan: heroku-redis
      as: REDIS
  config:
    APP_NAME: heroku-dockerize
build:
  docker:
    web: Dockerfile
  config:
    NODE_ENV: production
run:
  web: npm start

As we are using Redis to store key/value pairs, we can define the Heroku Redis add-ons. Under the `setup` field, we can create global configurations and add-ons. Under the `build` field, we have defined our process name and the Dockerfile used to build our Docker image. We can also pass build arguments to our Dockerfile under the `config` field. We haven’t used the `release` field as we are not running any migrations or pre-release tasks. Finally, we have defined our dyno and command to be executed in the `run` field.

Note: If we don’t define the `run` field, the CMD command provided in Dockerfile is used.
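For illustration, if we did need a pre-release task such as a database migration, a hypothetical `release` section might look like this (`npm run migrate` is an assumed script, not part of our app):

release:
  image: web
  command:
    - npm run migrate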

Let’s create our app on Heroku. I hope you have installed the Heroku CLI along with the `@heroku-cli/plugin-manifest` plugin; if not, please refer to this doc.
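For reference, installing the plugin looks like this (at the time of writing, manifest support required the CLI’s beta channel; check Heroku’s docs for your version):

heroku update beta
heroku plugins:install @heroku-cli/plugin-manifest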

Run the following command to create our app from Heroku manifest file with all the add-ons and config variables–

# Login to heroku using heroku CLI
$ heroku login

# Create our heroku app via manifest
$ heroku create heroku-dockerize --manifest

# Output
Reading heroku.yml manifest... done
Creating ⬢ heroku-dockerize... done, stack is container
Adding heroku-redis... done
Setting config vars... done
https://heroku-dockerize.herokuapp.com/ | https://git.heroku.com/heroku-dockerize.git

Awesome, we have created our app on Heroku, and it has also set our stack to `container`. Let’s move to the next section and create our CI/CD pipeline to deploy our Node.js application.

Heroku Continuous Integration

Continuous Integration (CI) is the process of automating the build and running all our test suites, including unit tests and integration tests, and performing code validation and linting to maintain code quality. If our code passes all the checks and tests, it is good to be deployed to production.

We will be using CircleCI for creating our Heroku CI/CD pipelines. CircleCI is a cloud-based tool to automate the integration and deployment process. Heroku also offers a CI/CD pipeline, but as we discussed above, the Heroku Container Runtime is not very mature and still under development, and hence it is not possible to test container builds via Heroku CI. We could also use other third-party CI/CD tools like Travis CI, GitHub Actions, etc.

CircleCI is triggered via webhooks whenever we push to the GitHub repo. We can configure the webhooks to trigger on different events like pushes, pull requests, etc. CircleCI looks for the `.circleci/config.yml` file, which contains the instructions/commands to be processed. Let’s create a `.circleci/config.yml` file–

version: 2.1
orbs:
  node: circleci/node@3.0.1
jobs:
  build-and-test:
    executor: node/default
    steps:
      - checkout
      - node/install-packages
      - run: 
          name: Running npm test
          command: |
            npm run test
      - run:
          name: Codecov coverage
          command: |
            npm run coverage
      - setup_remote_docker:
          version: 18.06.0-ce
      - run:
          name: Build Docker image
          command: |
            docker build -t heroku-dockerize:circleci .
            docker images
workflows:
  build-test-deploy:
    jobs:
      - build-and-test

CircleCI has recently introduced reusable packages written in YAML called “orbs” to help developers set up their CI/CD pipelines easily and efficiently. Orbs also allow us to integrate third-party tools effectively in our pipelines. In the above CircleCI config, we have used one such orb, `circleci/node`, to easily install Node.js and its package manager.

We have set up the CI pipeline to build and test our application: it will first install the application dependencies and then run all our integration tests. We have also added a step that builds the Docker image so we can verify it builds cleanly.

Let’s push this CircleCI config to our GitHub repo by running the following command–

git add .circleci/config.yml
git commit -m "Integrating CI pipeline"
git push origin master

We will now enable this repo from our CircleCI dashboard simply by clicking the “Set Up Project” button as shown in the image below–

Once we click on this button, it will ask us to add a config file as shown in the image below:

Click on the “Use Existing Config” button, since we have already added the CircleCI config, and then click “Start Building”. This starts the build for your Node.js application. We can also run pre-test or post-test scripts simply by adding another pre/post-run field.

Awesome, all tests passed; we are good to deploy this application to Heroku. Let’s move to our next step of integrating the CD pipeline to deploy this project to our Heroku application.

Integrating The CD Pipeline

In the previous step, we configured our CI pipeline to make builds and run all the tests, so the code is fit to be deployed to the production environment. A CI pipeline makes code easier to review, since the reviewer knows all the tests pass, and it gives developers better visibility into the application code when something breaks, without being blocked on other developers’ reviews.

Let’s deploy our application to Heroku by running the following command–

# deploy on Heroku
$ git push heroku master

# Open the website
$ heroku open

# Check the logs
$ heroku logs -a heroku-dockerize

Heroku has this great feature of deploying applications directly using Git. But as we can see, this is a manual step, and it’s never good to deploy an application to production without running proper tests and reviews. Let’s add our CD pipeline.

Before adding the pipeline, we need to authenticate the requests CircleCI makes to deploy our application on Heroku. This is done by setting these two environment variables–

  1. HEROKU_APP_NAME: The name of our Heroku app.
  2. HEROKU_API_KEY: Our Heroku API key, which serves as the token that authenticates and verifies our requests.

We can find the API key under the Heroku Account Settings.
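Alternatively, assuming we are already logged in, the CLI can print a usable token:

heroku auth:token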

Let’s add them as Environment Variables in our CircleCI Project Settings Page as shown in the image below.

Open the `.circleci/config.yml` file and set up the deploy pipeline–

version: 2.1
orbs:
  heroku: circleci/heroku@1.0.1
jobs:
  deploy:
    executor: heroku/default
    steps:
      - checkout
      - run:
          name: Storing previous commit
          command: |
            git rev-parse HEAD > ./commit.txt
      - heroku/install
      - setup_remote_docker:
          version: 18.06.0-ce
      - run:
          name: Pushing to heroku registry
          command: |
            heroku container:login
            heroku container:push web --arg NODE_ENV=production -a $HEROKU_APP_NAME
            heroku container:release web -a $HEROKU_APP_NAME
workflows:
  build-test-deploy:
    jobs:
      - deploy:
          requires:
            - build-and-test
          filters:
            branches:
              only:
                - master

We are now using the circleci/heroku orb to install the Heroku CLI and deploy the application. This orb also supports a deploy-via-git command to deploy the application as we did manually above, but we instead log in to the Heroku container registry and push our images, since we run a pre-script that stores the current commit in `commit.txt`, which backs a `/commit` endpoint showing the deployed commit.
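For illustration, such an endpoint might look roughly like this (a sketch; the repo’s actual implementation may differ):

const fs = require('fs');

// Serve the commit hash written to commit.txt by the CI pre-script
app.get('/commit', (req, res) => {
  fs.readFile('./commit.txt', 'utf8', (err, commit) => {
    if (err) return res.status(404).send('No commit info available');
    res.send(commit.trim());
  });
});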

Under workflows, we have made sure that our code is deployed only from the `master` branch, and only after the `build-and-test` job, i.e. our CI pipeline, has run successfully.

Awesome! Our final `.circleci/config.yml` file–

version: 2.1
orbs:
  node: circleci/node@3.0.1
  heroku: circleci/heroku@1.0.1
jobs:
  build-and-test:
    executor: node/default
    steps:
      - checkout
      - node/install-packages
      - run: 
          name: Running npm test
          command: |
            npm run test
      - run:
          name: Codecov coverage
          command: |
            npm run coverage
      - setup_remote_docker:
          version: 18.06.0-ce
      - run:
          name: Build Docker image
          command: |
            docker build -t heroku-dockerize:circleci .
            docker images
  deploy:
    executor: heroku/default
    steps:
      - checkout
      - run:
          name: Storing previous commit
          command: |
            git rev-parse HEAD > ./commit.txt
      - heroku/install
      - setup_remote_docker:
          version: 18.06.0-ce
      - run:
          name: Pushing to heroku registry
          command: |
            heroku container:login
            heroku container:push web --arg NODE_ENV=production -a $HEROKU_APP_NAME
            heroku container:release web -a $HEROKU_APP_NAME
workflows:
  build-test-deploy:
    jobs:
      - build-and-test
      - deploy:
          requires:
            - build-and-test
          filters:
            branches:
              only:
                - master

Let’s push our code with the Continuous Deployment job added to our CircleCI configuration. The build will start running as soon as we push. Let’s check the build–

Our pipeline is running and waiting for the CI pipeline (build-and-test) to finish successfully. Once it does, our CD pipeline (deploy) will run and deploy our application to Heroku.

Great, our CD pipeline is working perfectly. Let’s open our application and check the `/commit` endpoint; we can also verify the application’s other endpoints, like saving data to Redis as a key/value pair. Browse the application at https://heroku-dockerize.herokuapp.com

We have added the coverage and some extra tests under the `tests` branch, which has now been merged to master.

As we can see, it skipped the CD pipeline (deploy) and ran only the CI pipeline (build-and-test). This is because we have defined that CD runs only for the master branch once the CI pipeline succeeds. You can check the PR here: #3

Conclusion

In this post, we first saw how to dockerize our application and set up a local development environment using docker-compose. We then learned about the Heroku Container Runtime and how to automate deploying containerized applications to Heroku with a Continuous Integration & Deployment pipeline. This will eventually increase developer productivity and help teams deploy their projects easily.

The code is available in this GitHub repo. The repo also contains an integration with Travis CI in the `.travis.yml` file for anyone who prefers Travis.

You can follow Ankit on Twitter and Medium.

Ruby logging best practices and tips

Ruby is an opinionated language with inbuilt logging options that will serve the needs of small and basic applications. Whilst there are fewer alternatives than in, say, the JavaScript world, there are a handful, and in this post I will highlight those that are active (based on age and commit activity) and help you figure out the options for logging your Ruby (and Rails) applications.

Before proceeding, take note that this article was written using Rails v4.x; later versions of Rails may not be supported.

Best logging practices – general

Before deciding what tool works best for you, note that broad logging best practices also apply to Ruby logging and will help make anything you log more useful when tracking down a problem. Read “The Most Important Things to Log in Your Application Software” and the introduction of “JAVA logging – how to do it right” for more details, but in summary, these rules are:

  • Enable logging: This sounds obvious, but double check you have enabled logging (in whatever tool you use) before deploying your application and don’t solely rely on your infrastructure logging.
  • Categorize your logs: As an application grows in usage, the quantity of logs it generates will grow and the ability to filter logs to particular categories or error levels such as authorization, access, or critical can help you drill down into a barrage of information.
  • Logs are for everyone: Your logs are useful sources of information for a variety of stakeholders including support and QA engineers, and new programmers on your team. Keep them readable, understandable and with a clear purpose.

Inbuilt options

Ruby ships with two inbuilt methods for logging application flow: puts, most suited to command-line applications, and logger, for larger, more complex applications.

puts takes one parameter, the object(s) you want to output to stdout, with each item outputting to a new line:

puts(@post.name, @post.title)

logger provides a lot of options for an inbuilt class, and Rails enables it by default.

logger.info "#{@post.name}, #{@post.title}"

The logger class provides all the log information you typically see when running Rails. You can set levels with each message, and log messages above a certain level.

require 'logger'
logger = Logger.new(STDOUT)
logger.level = Logger::WARN

logger.debug("No output")
logger.info("No output")
logger.warn("Output")
logger.fatal("Output")

Logger doesn’t escape or sanitize output, so remember to handle that yourself. For details on how to do this, and how to create custom loggers and message formatters, read the official docs.
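As one illustration (my own example, not from the official docs), a custom formatter can collapse whitespace so user-supplied input can’t forge extra log lines:

require 'logger'

logger = Logger.new(STDOUT)

# Collapse all runs of whitespace (including newlines) into single spaces
logger.formatter = proc do |severity, datetime, progname, msg|
  "#{datetime} #{severity}: #{msg.to_s.gsub(/\s+/, ' ')}\n"
end

logger.info("user input\nwith a forged second line")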

Taming logger with Lograge

Whilst many Rails developers find its default logging options essential during development, in production the output can be noisy, overwhelming, and at worst unhelpful. Lograge attempts to reduce this noise to more salient and useful information, in a format that is less human-readable but more useful to external logging systems if you use its JSON-formatted output option.

There are many ways to initialize and configure the gem; I stuck to the simplest, creating a file at config/initializers/lograge.rb with the following content:

Rails.application.configure do
  config.lograge.enabled = true
end

This changes the output to the following:

Lograge output

Unsurprisingly, there are a lot of configuration options to tweak the logging output to suit you, based on values available in the logging event. For example, to add a timestamp:

Rails.application.configure do
  config.lograge.enabled = true

  config.lograge.custom_options = lambda do |event|
    { time: event.time }
  end
end

You can also add custom payloads to the logging information to access application controller methods such as request and current_user:

config.lograge.custom_payload do |controller|
  {
    # key_name: controller.request.*,
    # key_name: controller.current_user.*
    # etc…
  }
end

If adding all this information already feels counter to the point of lograge, then it also gives the ability to remove information based on certain criteria. For example:

config.lograge.ignore_actions = ['PostsController#index', 'VisitorsController#new']
config.lograge.ignore_custom = lambda do |event|
  # return true if you want to ignore based on the event
end

Logging with logging

Drawing inspiration from Java’s log4j library, logging offers similar functionality to the inbuilt logger, but adds hierarchical logging, custom level names, multiple output destinations and more.

require 'logging'

logger = Logging.logger(STDOUT)
logger.level = :warn

logger.debug "No output"
logger.warn "output"
logger.fatal "output"

Or create custom loggers that output to different locations and are assigned to different classes:

require 'logging'

# Give the root logger a stdout appender so child loggers inherit it
Logging.logger.root.add_appenders(Logging.appenders.stdout)

Logging.logger['ImportantClass'].level = :warn
Logging.logger['NotSoImportantClass'].level = :debug

# ImportantClass additionally logs to a file
Logging.logger['ImportantClass'].add_appenders(
  Logging.appenders.file('example.log')
)

class ImportantClass
  def run
    # level is :warn, so use warn or above for output
    Logging.logger[self.class].warn "I will log to a file"
  end
end

class NotSoImportantClass
  def run
    Logging.logger[self.class].debug "I will log to stdout"
  end
end

Next Generation Logging with Semantic Logger

One of the more recent projects on this list, semantic_logger aims to support high-availability applications and offers a comprehensive list of logging destinations out of the box. If you are using Rails, use the rails_semantic_logger gem instead, which overrides the Rails logger with itself. There are a lot of configuration options, letting you set log levels, tags, log format, and much more. For example:

config.log_level = :info

# Send to Elasticsearch
SemanticLogger.add_appender(
  appender: :elasticsearch,
  url:      'http://localhost:9200'
)

config.log_tags = {
  #  key_name: :value,
  #  key_name: -> request { request.object['value'] }
}

Logging to external services

With all the above options, you will still need to parse, process, and understand your logs somehow, and numerous open source and commercial services can help you do this (open your favorite search engine and you’ll find plenty). I’ll highlight those that support Ruby well.

If you’re a fluentd user, then there’s a Ruby gem that offers different ways to send your log data. If you’re a Kibana user, then Elastic offers a gem that integrates with the whole ELK stack.

Papertrail has a gem that extends the default logger to send logs to their remote endpoint. They haven’t updated it in a while, but it is still their official solution, so it should work; if it doesn’t, they offer an alternative method.

Loggly uses lograge and some custom configuration to send log data to their service.

And for any Airbrake users, the company also offers a gem for direct integration into their service.

There are also a handful of gems that send the default ruby logs to syslog, which then enables you to send your logging data to a large amount of external open source and commercial logging services.

And of course, Coralogix’ own package allows you to create different loggers, assign a log level to them, and attach other useful metadata. In addition to standard logging features such as flexible log querying, email alerts, centralized live tail, and a fully hosted Kibana, Coralogix provides machine-learning-powered anomaly detection in the context of software builds.

Another benefit is that Coralogix is the only solution which offers straightforward pricing: all packages include all features.

First, create an initializer in initializers/coralogix.rb with your account details, set a default class and extend the default Rails logger:

require 'coralogix_logger'

# Private key is received upon registration
PRIVATE_KEY = "<PRIVATE_KEY>"
# Application name separates environments
APP_NAME = "Ruby Rails Tester"
# Subsystem name separates components
SUB_SYSTEM = "Ruby Rails Tester Sub System"

Coralogix::CoralogixLogger.configure(PRIVATE_KEY, APP_NAME, SUB_SYSTEM)
Rails.application.config.coralogix_logger =
  Coralogix::CoralogixLogger.get_logger("feed-service")
Rails.logger.extend(ActiveSupport::Logger.broadcast(Rails.application.config.coralogix_logger))

Then, in each class, we recommend you get an instance of the logger and set the logger name to the class name, which Coralogix will use as the category. The log() method offers options to tailor the logging precision; the severity and message are mandatory:

logger = Coralogix::CoralogixLogger.get_logger("Posts Controller")
logger.log(Coralogix::Severity::VERBOSE, "Post created #{post.inspect}")

You can also use severity methods if you prefer, where only the message is mandatory, but other options are available:

logger.verbose("Post created #{post.inspect}")

And there you have it:

Coralogix logs stream

Know your Code

Whilst Ruby and Rails lack the ‘cool factor’ they had in the past, Ruby still claims 5th place among the most used languages depending on where you look (it peaked at 4th place in 2013), 12th place in the IEEE Spectrum ranking, and 4th place on GitHub. It’s still a relevant and widely used language, especially on certain platforms, such as web backends and Heroku. This means your code should be as optimized as possible, and of course your Ruby/Rails logging should be as organized as possible. I hope this post helps you track down the source of potential problems in the future.

Quick Tips for Heroku Logging with Coralogix

TL;DR: Coralogix released a new add-on for Heroku: https://elements.heroku.com/addons/coralogix. In this post we cover logging best practices, how different cloud providers enable them, what makes Heroku logging special, and how Coralogix utilizes Heroku’s approach to provide its 3rd generation logging experience.

Lost in the Clouds

Widespread cloud hosting opened a world of possibilities for developers, reducing the need to maintain, monitor, and scale their own infrastructure and letting them focus on building their applications. When cloud services work the way you expect, not having to worry about what happens behind the scenes is liberating, but it also means giving up control, and when you experience a problem (and you will), tracing its source can be a challenge. A typical source of information is your log files, but as infrastructure, locations, and services constantly shift around the world, where do you look, how do you access them, and what should you log that’s relevant to such a setup? In this post, I will focus mainly on Heroku and compare its very different way of handling logging to most other cloud providers.

Logging Best Practices

Even with fully cloud-based applications, most of the normal rules of logging apply, and following them will make issues in your application easier to trace. Read “The Most Important Things to Log in Your Application Software” and “JAVA logging – how to do it right” for more details, but in summary, these rules are:

  • Enable logging: This sounds like an obvious rule in an article about logging, but double check you have enabled logging before deploying your application and don’t solely rely on your infrastructure logging.
  • Categorize your logs: Especially in a distributed system, the ability to filter logs to particular categories, or error levels such as authorization, access, or critical can help you drill down into a barrage of information.
  • Logs are not only for you: Your logs are useful sources of information for a variety of stakeholders including support and QA engineers, and new programmers on your team. Keep them readable, understandable and clear as to their purpose.
  • Use 3rd party systems: There are dozens and dozens of 3rd party logging tools to help you consolidate and understand your logs better. From open source to SaaS options, you will find a solution that suits your needs.
  • Use standard logging libraries: Whether you are writing to file or using a log aggregator, always prefer a standard logging library over doing your own log printing. We often see mistakes like using console.log, which causes line breaks to be read as separate log lines, etc. There are many great logging libraries out there for most languages; prefer them over implementing your own logging.

Amazon Web Services

In true Amazon fashion, AWS has its own custom solution for logging that integrates well with all aspects of its service but requires learning new tools and processes. CloudWatch Logs aggregates real-time and streamed logs from various AWS services into one convenient location, and adds alerting to CloudTrail based on defined conditions and events. CloudWatch has integrations with about a dozen 3rd party partner services including Splunk and DataDog, but if your provider of choice isn’t on that list then you have to rely on CloudWatch or parse the logs yourself from an S3 bucket.

Microsoft Azure

Azure’s logging offering is comprehensive, offering logging at most levels of the technology stack, in a self-contained monitoring UI. Azure then allows you to take this a step further with log analytics and for larger teams, the Operations Management Suite expands log analytics into an entire operations suite.

Google Compute Engine

Google Compute Engine offers Stackdriver, which is based on the open source fluentd logging layer. If you’re a GCE user, it’s your best option, with direct connections to Google’s storage and analysis products. Intriguingly, it also works with AWS (but not Azure), making it an ideal candidate if your application runs across multiple clouds (and it should). Stackdriver receives log entries from multiple sources across the GCE stack, including VMs and applications. You can analyze your logs via a semi-searchable GUI, an API, or a CLI tool.

Heroku

Heroku takes a different approach to cloud-based application development, shifting the focus from infrastructure to applications. Combined with Heroku’s tools for Docker, Git and CLI workflows, it makes developing applications for testing and production a breeze. However, the ephemeral nature of Heroku makes knowing what’s going on behind the scenes of your application something of a mystery at times.

By default, logging consists of using the subcommand heroku logs to view output, with a variety of parameters to filter by source or dyno (Heroku parlance for an instance). But once you add multiple dynos (which you invariably will as you scale), getting to grips with what’s going on in your logs can be difficult. Filtering the log output is a start, but you need to know where to look before you start looking, and generally you resort to logs because you’re unsure what the problem is.
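For example (flags as documented in the Heroku CLI; check heroku logs --help for your version):

# Tail only application output from a single dyno
heroku logs --tail --source app --dyno web.1 -a <your-app-name>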

To aid the process, Heroku offers the ‘Log Drains’ feature to forward raw log output to 3rd party providers, or to endpoints of your own creation. This makes initial log usage harder, but by not imposing an official logging solution, it offers far more flexibility over your specific needs in the longer term.
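Adding a drain is a one-liner; the endpoint URL below is a placeholder:

heroku drains:add syslog+tls://logs.example.com:6514 -a <your-app-name>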

Coralogix and Heroku

Coralogix has recently released a logging add-on that forwards all Heroku output to a Coralogix account you specify.

provision coralogix on heroku

Or use the CLI with:

heroku addons:create coralogix:test

Then within a few seconds, any logs your application generates will feed right into Coralogix, ready for the high-scale, high-speed search and analytics that Coralogix offers. Here’s a simple example showing that even with a small application generating simple log records, as soon as it starts scaling, logs become hard to traverse and understand.

Sample Application

The sample application consists of a Python and a Ruby application that randomly fetch a cat from the Cat API every 2 seconds and log the value.

I started with one instance of each, then scaled to three instances. Here are the logs from heroku logs on one instance.

one worker sending logs to coralogix

And here is another example of mixed logs, with messages from Heroku, debug messages, and errors: already getting confusing.

two workers sending logs to Coralogix

Throw in two more workers and tracking down the source of messages gets even more confusing. And the sample application is only two applications running three workers each.

multiple workers on Coralogix

Let’s use Coralogix instead. You can open the dashboard page, or again use the CLI and see the initial display right after you start sending logs:

CLI: heroku addons:open coralogix

coralogix dashboard

view heroku logs in coralogix

After 24 hours, Coralogix’ Loggregation feature kicks in to automatically cluster log entries and make finding what you need faster and easier. In the screenshot below you can see that Loggregation has identified that, despite being ‘random’, the Cat API repeats itself a lot, outputting the same value over 1,000 times. In this case it’s not important, but in a mission-critical application, this helps you identify patterns clearly.

loggregation on heroku logs

Errors and broken flows are often introduced by new versions and builds of an application. Coralogix’s Heroku integration includes Heroku Pipelines support and provides an automatic status for new Heroku builds (or tags). Coralogix presents the suspicious and common errors introduced since a build, as well as the alerts and anomalies which might be related to the new version release, allowing you to pinpoint issues to particular versions and points in time.

Heroku builds status coralogix

new logs created since heroku build

After 5 days of learning, Coralogix will start to send a daily report highlighting new anomalies (errors and critical messages), further helping you identify new errors without constantly bombarding you with logging noise.

In addition to the machine learning aspect, Coralogix offers an entire set of logging capabilities, including log query, a centralized live tail, user-defined alerts to email or Slack, and a fully hosted Kibana for data slicing and dashboard creation.

Making Logging Personal

I started this post by highlighting best practices for logging, and many of them focus on making logs work for you and your team. By compelling you to use their own internal solutions, many cloud providers make your logging experience impersonal and unfocused. Whilst logging with Heroku can take extra steps to begin with, once you get the setup honed, it can result in a logging system that truly works the way you want and helps you see the issues that matter.

Provision the Coralogix add-on and start enjoying 3rd generation log analytics.