Python Logging – The Ultimate Guide

Python Logging Tutorial

This guide is focused on how to log in Python using the built-in support for logging. It introduces various concepts that are relevant to understanding Python logging, discusses the corresponding logging APIs in Python and how to use them, and presents best practices and performance considerations for using these APIs.

We will introduce various concepts relevant to understanding logging in Python, discuss the corresponding logging APIs in Python and how to use them, and present best practices and performance considerations for using these APIs.

This article assumes the reader has a good grasp of programming in Python; specifically, concepts and constructs pertaining to general programming and object-oriented programming. The information and examples in this article are based on Python version 3.8.

Python Logging Concepts and API

Python has offered built-in support for logging since version 2.3. This support includes library APIs for common concepts and tasks that are specific to logging and language-agnostic. This article introduces these concepts and tasks as realized and supported in Python’s logging library.

Basic Python Logging Concepts

When we use a logging library, we perform/trigger the following common tasks while using the associated concepts (highlighted in bold).

  1. A client issues a log request by executing a logging statement. Often, such logging statements invoke a function/method in the logging (library) API by providing the log data and the logging level as arguments. The logging level specifies the importance of the log request. Log data is often a log message, which is a string, along with some extra data to be logged. Often, the logging API is exposed via logger objects.
  2. To enable the processing of a request as it threads through the logging library, the logging library creates a log record that represents the log request and captures the corresponding log data.
  3. Based on how the logging library is configured (via a logging configuration), the logging library filters the log requests/records. This filtering involves comparing the requested logging level to the threshold logging level and passing the log records through user-provided filters.
  4. Handlers process the filtered log records to either store the log data (e.g., write the log data into a file) or perform other actions involving the log data (e.g., send an email with the log data). In some logging libraries, before processing log records, a handler may again filter the log records based on the handler’s logging level and user-provided handler-specific filters. Also, when needed, handlers often rely on user-provided formatters to format log records into strings, i.e., log entries.

Independent of the logging library, the above tasks are performed in an order similar to that shown in Figure 1.

image2

Figure 1: The flow of tasks when logging via a logging library

The Python Logging Library

Python’s standard library offers support for logging via logging, logging.config, and logging.handlers modules.

  • logging module provides the primary client-facing API.
  • logging.config module provides the API to configure logging in a client.
  • logging.handlers module provides different handlers that cover common ways of processing and storing log records.

We collectively refer to these modules as Python’s logging library.

These modules realize the concepts introduced in the previous section as classes, a set of module-level functions, or a set of constants. Figure 2 shows these classes and the associations between them.

image1

Figure 2: Python classes and constants representing various logging concepts

Logging Level

Out of the box, the Python logging library supports five logging levels: critical, error, warning, info, and debug. These levels are denoted by constants with the same name in the logging module, i.e., logging.CRITICAL, logging.ERROR, logging.WARNING, logging.INFO, and logging.DEBUG. The values of these constants are 50, 40, 30, 20, and 10, respectively.

At runtime, the numeric value of a logging level determines the meaning of a logging level. Consequently, clients can introduce new logging levels by using numeric values that are greater than 0 and not equal to pre-defined logging levels as logging levels.

Logging levels can have names. When names are available, logging levels appear by their names in log entries. Every pre-defined logging level has the same name as the name of the corresponding constant; hence, they appear by their names in log entries, e.g., logging.WARNING and 30 levels appear as ‘WARNING’. In contrast, custom logging levels are unnamed by default. So, an unnamed custom logging level with numeric value n appears as ‘Level n’ in log entries, and this results in inconsistent and human-unfriendly log entries. To address this, clients can name a custom logging level using the module-level function logging.addLevelName(level, levelName). For example, by using logging.addLevelName(33, 'CUSTOM1'), level 33 will be recorded as ‘CUSTOM1’.

The Python logging library adopts the community-wide applicability rules for logging levels, i.e., when should logging level X be used?

  1. Use logging.DEBUG to log detailed information, typically of interest only when diagnosing problems, e.g., when the app starts.
  2. Use logging.INFO to confirm the software is working as expected, e.g., when the app initializes successfully.
  3. Use logging.WARNING to report behaviors that are unexpected or are indicative of future problems but do not affect the current functioning of the software, e.g., when the app detects low memory, and this could affect the future performance of the app.
  4. Use logging.ERROR to report the software has failed to perform some function, e.g., when the app fails to save the data due to insufficient permission.
  5. Use logging.CRITICAL to report serious errors that may prevent the continued execution of the software, e.g., when the app fails to allocate memory.

Loggers

The logging.Logger objects offer the primary interface to the logging library. These objects provide the logging methods to issue log requests along with the methods to query and modify their state. From here on out, we will refer to Logger objects as loggers.

Creation The factory function logging.getLogger(name) is typically used to create loggers. By using the factory function, clients can rely on the library to manage loggers and to access loggers via their names instead of storing and passing references to loggers.

The name argument in the factory function is typically a dot-separated hierarchical name, e.g., a.b.c. This naming convention enables the library to maintain a hierarchy of loggers. Specifically, when the factory function creates a logger, the library ensures a logger exists for each level of the hierarchy specified by the name and every logger in the hierarchy is linked to its parent and child loggers.

Threshold Logging Level Each logger has a threshold logging level that determines if a log request should be processed. A logger processes a log request if the numeric value of the requested logging level is greater than or equal to the numeric value of the logger’s threshold logging level. Clients can retrieve and change the threshold logging level of a logger via Logger.getEffectiveLevel() and Logger.setLevel(level) methods, respectively.

When the factory function is used to create a logger, the function sets a logger’s threshold logging level to the threshold logging level of its parent logger as determined by its name.

Logging Methods Every logger offers the following logging methods to issue log requests.

Each of these methods is a shorthand to issue log requests with corresponding pre-defined logging level as the requested logging level.

In addition to the above methods, loggers also offer the following two methods:

  • Logger.log(level, msg, *args, **kwargs) issues log requests with explicitly specified logging levels. This method is useful when using custom logging levels.
  • Logger.exception(msg, *args, **kwargs) issues log requests with the logging level ERROR and that capture the current exception as part of the log entries. Consequently, clients should invoke this method only from an exception handler.

msg and args arguments in the above methods are combined to create log messages captured by log entries. All of the above methods support the keyword argument exc_info to add exception information to log entries and stack_info and stacklevel to add call stack information to log entries. Also, they support the keyword argument extra, which is a dictionary, to pass values relevant to filters, handlers, and formatters.

When executed, the above methods perform/trigger all of the tasks shown in Figure 1 and the following two tasks:

  1. After deciding to process a log request based on its logging level and the threshold logging level, the logger creates a LogRecord object to represent the log request in the downstream processing of the request. LogRecord objects capture the msg and args arguments of logging methods and the exception and call stack information along with source code information. They also capture the keys and values in the extra argument of the logging method as fields.
  2. After every handler of a logger has processed a log request, the handlers of its ancestor loggers process the request (in the order they are encountered walking up the logger hierarchy). The Logger.propagate field controls this aspect, which is True by default.

Beyond logging levels, filters provide a finer means to filter log requests based on the information in a log record, e.g., ignore log requests issued in a specific class. Clients can add and remove filters to/from loggers using Logger.addFilter(filter) and Logger.removeFilter(filter) methods, respectively.

Filters

Any function or callable that accepts a log record argument and returns zero to reject the record and a non-zero value to admit the record can serve as a filter. Any object that offers a method with the signature filter(record: LogRecord) -> int can also serve as a filter.

A subclass of logging.Filter(name: str) that optionally overrides the logging.Filter.filter(record) method can also serve as a filter. Without overriding the filter method, such a filter will admit records emitted by loggers that have the same name as the filter and are children of the filter (based on the name of the loggers and the filter). If the name of the filter is empty, then the filter admits all records. If the method is overridden, then it should return zero value to reject the record and non-zero value to admit the record.

Handlers

The logging.Handler objects perform the final processing of log records, i.e., logging log requests. This final processing often translates into storing the log record, e.g., writing it into system logs or files. It can also translate it to communicate the log record data to specific entities (e.g., send an email) or passing the log record to other entities for further processing (e.g., provide the log record to a log collection process or a log collection service).

Like loggers, handlers have a threshold logging level, which can be set via theHandler.setLevel(level) method. They also support filters via Handler.addFilter(filter) and Handler.removeFilter(filter) methods.

The handlers use their threshold logging level and filters to filter log records for processing. This additional filtering allows context-specific control over logging, e.g., a notifying handler should only process log requests that are critical or from a flaky module.

While processing the log records, handlers format log records into log entries using their formatters. Clients can set the formatter for a handler via Handler.setFormatter(formatter) method. If a handler does not have a formatter, then it uses the default formatter provided by the library.

The logging.handler module provides a rich collection of 15 useful handlers that cover many common use cases (including the ones mentioned above). So, instantiating and configuring these handlers suffices in many situations.

In situations that warrant custom handlers, developers can extend the Handler class or one of the pre-defined Handler classes by implementing the Handler.emit(record) method to log the provided log record.

Formatters

The handlers use logging.Formatter objects to format a log record into a string-based log entry.

Note: Formatters do not control the creation of log messages.

A formatter works by combining the fields/data in a log record with the user-specified format string.

Unlike handlers, the logging library only provides a basic formatter that logs the requested logging level, the logger’s name, and the log message. So, beyond simple use cases, clients need to create new formatters by creating logging.Formatter objects with the necessary format strings.

Formatters support three styles of format strings:

printf, e.g., ‘%(levelname)s:%(name)s:%(message)s’

str.format(), e.g., ‘{levelname}:{name}:{message}’

str.template, e.g., ‘$levelname:$name:$message’

The format string of a formatter can refer to any field of LogRecord objects, including the fields based on the keys of the extra argument of the logging method.

Before formatting a log record, the formatter uses the LogRecord.getMessage() method to construct the log message by combining the msg and args arguments of the logging method (stored in the log record) using the string formatting operator (%). The formatter then combines the resulting log message with the data in the log record using the specified format string to create the log entry.

Python Logging Module

To maintain a hierarchy of loggers, when a client uses the logging library, the library creates a root logger that serves as the root of the hierarchy of loggers. The default threshold logging level of the root logger is logging.WARNING.

The module offers all of the logging methods offered by the Logger class as module-level functions with identical names and signature, e.g., logging.debug(msg, *args, **kwargs). Clients can use these functions to issue log requests without creating loggers, and the root logger services these requests. If the root logger has no handlers when servicing log requests issued via these methods, then the logging library adds a logging.StreamHandler instance based on the sys.stderr stream as a handler to the root logger.

When loggers without handlers receive log requests, the logging library directs such log requests to the last resort handler, which is a logging.StreamHandler instance based on sys.stderr stream. This handler is accessible via the logging.lastResort attribute.

Using Python’s Logging Library

Here are a few code snippets illustrating how to use the Python logging library.

Snippet 1: Creating a logger with a handler and a formatter

# main.py
import logging, sys

def _init_logger():
    logger = logging.getLogger('app')  #1
    logger.setLevel(logging.INFO)  #2
    handler = logging.StreamHandler(sys.stderr)  #3
    handler.setLevel(logging.INFO)  #4
    formatter = logging.Formatter(  
           '%(created)f:%(levelname)s:%(name)s:%(module)s:%(message)s') #5
    handler.setFormatter(formatter)  #6
    logger.addHandler(handler)  #7

_init_logger()
_logger = logging.getLogger('app')

This snippet does the following.

  1. Create a logger named ‘app’.
  2. Set the threshold logging level of the logger to INFO.
  3. Create a stream-based handler that writes the log entries into the standard error stream.
  4. Set the threshold logging level of the handler to INFO.
  5. Create a formatter to capture
    • the time of the log request as the number of seconds since epoch,
    • the logging level of the request,
    • the logger’s name,
    • the name of the module issuing the log request, and
    • the log message.
  6. Set the created formatter as the formatter of the handler.
  7. Add the created handler to this logger.

By changing the handler created in step 3, we can redirect the log entries to different locations or processors.

Snippet 2: Issuing log requests

# main.py
_logger.info('App started in %s', os.getcwd())

This snippet logs informational messages stating the app has started.

When the app is started in the folder /home/kali with the logger created using snippet 1, this snippet will generate the log entry 1586147623.484407:INFO:app:main:App started in /home/kali/ in standard error stream.

Snippet 3: Issuing log requests

# app/io.py
import logging

def _init_logger():
    logger = logging.getLogger('app.io')
    logger.setLevel(logging.INFO)  

_init_logger()
_logger = logging.getLogger('app.io')

def write_data(file_name, data):
    try:
        # write data
        _logger.info('Successfully wrote %d bytes into %s', len(data), file_name)
    except FileNotFoundError:
        _logger.exception('Failed to write data into %s', file_name)

This snippet logs an informational message every time data is written successfully via write_data. If a write fails, then the snippet logs an error message that includes the stack trace in which the exception occurred.

With the logger created using snippet 1, if the execution of app.write_data('/tmp/tmp_data.txt', image_data) succeeds, then this snippet will generate a log entry similar to 1586149091.005398:INFO:app.io:io:Successfully wrote 134 bytes into /tmp/tmp_data.txt. If the execution of app.write_data('/tmp/tmp_data.txt', image_data) fails, then this snippet will generate the following log entry:

1586149219.893821:ERROR:app:io:Failed to write data into /tmp1/tmp_data.txt
Traceback (most recent call last):
  File "/home/kali/program/app/io.py", line 12, in write_data
    print(open(file_name), data)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp1/tmp_data.txt'

Instead of using positional arguments in the format string in the logging method, we could achieve the same output by using the arguments via their names as follows:

_logger.info('Successfully wrote %(data_size)s bytes into %(file_name)s',
    {'data_size': len(data), 'file_name': file_name})

 

Snippet 4: Filtering log requests

# main.py
import logging, os, sys
import app.io

def _init_logger():
    logger = logging.getLogger('app')
    logger.setLevel(logging.INFO)  
    formatter = logging.Formatter(  
        '%(created)f:%(levelname)s:%(name)s:%(module)s:%(message)s')
    handler = logging.StreamHandler(sys.stderr)
    handler.setLevel(logging.INFO)  
    handler.setFormatter(formatter) 
    handler.addFilter(lambda record: record.version > 5 or #1
            record.levelno >= logging.ERROR)               #1
    logger.addHandler(handler) 

_init_logger()
_logger = logging.LoggerAdapter(logging.getLogger('app'), {'version': 6})  #2

This snippet modifies Snippet 1 as follows.

  1. Lines marked #1 add a filter to the handler. This filter admits log records only if their logging level is greater than or equal to logging.ERROR or they are from a component whose version is higher than 4.
  2. Line marked #2 wraps the logger in a logging.LoggerAdapter object to inject version information into log records.

The logging.LoggerAdapter class provides a mechanism to inject contextual information into log records. We discuss other mechanisms to inject contextual information in the Good Practices and Gotchas section.

# app/io.py
import logging

def _init_logger():
    logger = logging.getLogger('app.io')
    logger.setLevel(logging.INFO)  

_init_logger()
_logger = logging.LoggerAdapter(logging.getLogger('app.io'), {'version': 3})  # 1

def write_data(file_name, data):
    try:
        # write data
        _logger.info('Successfully wrote %d bytes into %s', len(data),
            file_name)
    except FileNotFoundError:
        _logger.exception('Failed to write data into %s', file_name)

This snippet modifies Snippet 3 by wrapping the logger in a LoggerAdapter object to inject version information.

All of the above changes affect the logging behavior of the app described in Snippet 2 and Snippet 3 as follows.

  1. The request to log the informational message about the start of the app is processed as the version info supplied by the module satisfies the filter.
  2. The request to log the informational message about the successful write is ignored as the version info supplied by the module fails to satisfy the filter.
  3. The request to log the error message about the failure to write data is processed as the logging level of the message satisfies the filter.

What do you suppose would have happened if the filter was added to the logger instead of the handler? See Gotchas for the answer.

Python Logging Configuration

The logging classes introduced in the previous section provide methods to configure their instances and, consequently, customize the use of the logging library. Snippet 1 demonstrates how to use configuration methods. These methods are best used in simple single-file programs.

When involved programs (e.g., apps, libraries) use the logging library, a better option is to externalize the configuration of the logging library. Such externalization allows users to customize certain facets of logging in a program (e.g., specify the location of log files, use custom loggers/handlers/formatters/filters) and, hence, ease the deployment and use of the program. We refer to this approach to configuration as data-based approach.

Configuring the Library

Clients can configure the logging library by invoking logging.config.dictConfig(config: Dict) function. The config argument is a dictionary and the following optional keys can be used to specify a configuration.

filters key maps to a dictionary of strings and dictionaries. The strings serve as filter ids used to refer to filters in the configuration (e.g., adding a filter to a logger) while the mapped dictionaries serve as filter configurations. The string value of the name key in filter configurations is used to construct logging.Filter instances.

"filters": {
  "io_filter": {
    "name": "app.io"
  }
}

This configuration snippet results in the creation of a filter that admits all records created by the logger named ‘app.io’ or its descendants.

formatters key maps to a dictionary of strings and dictionaries. The strings serve as formatter ids used to refer to formatters in the configuration (e.g., adding a formatter to a handler) while the mapped dictionaries serve as formatter configurations. The string values of the datefmt and format keys in formatter configurations are used as the date and log entry formatting strings, respectively, to construct logging.Formatter instances. The boolean value of the (optional) validate key controls the validation of the format strings during the construction of a formatter.

"formatters": {
  "simple": {
    "format": "%(asctime)s - %(message)s",
    "datefmt": "%y%j-%H%M%S"

  },
  "detailed": {
    "format": "%(asctime)s - %(pathname):%(lineno) - %(message)s"
  }
}

This configuration snippet results in the creation of two formatters. A simple formatter with the specified log entry and date formatting strings and detailed formatter with specified log entry formatting string and default date formatting string.

handlers key maps to a dictionary of strings and dictionaries. The strings serve as handler ids used to refer to handlers in the configuration (e.g., adding a handler to a logger) while the mapped dictionaries serve as handler configurations. The string value of the class key in a handler configuration names the class to instantiate to construct a handler. The string value of the (optional) level key specifies the logging level of the instantiated handler. The string value of the (optional) formatter key specifies the id of the formatter of the handler. Likewise, the list of values of the (optional) filters key specifies the ids of the filters of the handler. The remaining keys are passed as keyword arguments to the handler’s constructor.

"handlers": {
  "stderr": {
    "class": "logging.StreamHandler",
    "level": "INFO",
    "filters": ["io_filter"],
    "formatter": "simple",
    "stream": "ext://sys.stderr"
  },
  "alert": {
    "class": "logging.handlers.SMTPHandler",
    "level": "ERROR",
    "formatter": "detailed",
    "mailhost": "smtp.skynet.com",
    "fromaddr": "[email protected]",
    "toaddrs": [ "[email protected]", "[email protected]" ],
    "subject": "System Alert"
  }
}

This configuration snippet results in the creation of two handlers:

  • A stderr handler that formats log requests with INFO and higher logging level log via the simple formatter and emits the resulting log entry into the standard error stream. The stream key is passed as keyword arguments to logging.StreamHandler constructor.
    The value of the stream key illustrates how to access objects external to the configuration. The ext:// prefixed string refers to the object that is accessible when the string without the ext:// prefix (i.e., sys.stderr) is processed via the normal importing mechanism. Refer to Access to external objects for more details. Refer to Access to internal objects for details about a similar mechanism based on cfg:// prefix to refer to objects internal to a configuration.
  • An alert handler that formats ERROR and CRITICAL log requests via the detailed formatter and emails the resulting log entry to the given email addresses. The keys mailhost, formaddr, toaddrs, and subject are passed as keyword arguments to logging.handlers.SMTPHandler’s constructor.

loggers key maps to a dictionary of strings that serve as logger names and dictionaries that serve as logger configurations. The string value of the (optional) level key specifies the logging level of the logger. The boolean value of the (optional) propagate key specifies the propagation setting of the logger. The list of values of the (optional) filters key specifies the ids of the filters of the logger. Likewise, the list of values of the (optional) handlers key specifies the ids of the handlers of the logger.

"loggers": {
  "app": {
    "handlers": ["stderr", "alert"],
    "level": "WARNING"
  },
  "app.io": {
    "level": "INFO"
  }
}

This configuration snippet results in the creation of two loggers. The first logger is named app, its threshold logging level is set to WARNING, and it is configured to forward log requests to stderr and alert handlers. The second logger is named app.io, and its threshold logging level is set to INFO. Since a log request is propagated to the handlers associated with every ascendant logger, every log request with INFO or a higher logging level made via the app.io logger will be propagated to and handled by both stderr and alert handlers.

root key maps to a dictionary of configuration for the root logger. The format of the mapped dictionary is the same as the mapped dictionary for a logger.

incremental key maps to either True or False (default). If True, then only logging levels and propagate options of loggers, handlers, and root loggers are processed, and all other bits of the configuration is ignored. This key is useful to alter existing logging configuration. Refer to Incremental Configuration for more details.

disable_existing_loggers key maps to either True (default) or False. If False, then all existing non-root loggers are disabled as a result of processing this configuration.

Also, the config argument should map the version key to 1.

Here’s the complete configuration composed of the above snippets.

{
  "version": 1,
  "filters": {
    "io_filter": {
      "name": "app.io"
    }
  },
  "formatters": {
    "simple": {
      "format": "%(asctime)s - %(message)s",
      "datefmt": "%y%j-%H%M%S"

    },
    "detailed": {
      "format": "%(asctime)s - %(pathname):%(lineno) - %(message)s"
    }
  },
  "handlers": {
    "stderr": {
      "class": "logging.StreamHandler",
      "level": "INFO",
      "filters": ["io_filter"],
      "formatter": "simple",
      "stream": "ext://sys.stderr"
    },
    "alert": {
      "class": "logging.handlers.SMTPHandler",
      "level": "ERROR",
      "formatter": "detailed",
      "mailhost": "smtp.skynet.com",
      "fromaddr": "[email protected]",
      "toaddrs": [ "[email protected]", "[email protected]" ],
      "subject": "System Alert"
    }
  },
  "loggers": {
    "app": {
      "handlers": ["stderr", "alert"],
      "level": "WARNING"
    },
    "app.io": {
      "level": "INFO"
    }
  }
}

Customizing via Factory Functions

The configuration schema for filters supports a pattern to specify a factory function to create a filter. In this pattern, a filter configuration maps the () key to the fully qualified name of a filter creating factory function along with a set of keys and values to be passed as keyword arguments to the factory function. In addition, attributes and values can be added to custom filters by mapping the . key to a dictionary of attribute names and values.

For example, the below configuration will cause the invocation of app.logging.customFilterFactory(startTime='6PM', endTime='6AM') to create a custom filter and the addition of local attribute with the value True to this filter.

  "filters": {
    "time_filter": {
      "()": "app.logging.create_custom_factory",
      "startTime": "6PM",
      "endTime": "6PM",
      ".": {
        "local": true
      }
    }
  }

Configuration schemas for formatters, handlers, and loggers also support the above pattern. In the case of handlers/loggers, if this pattern and the class key occur in the configuration dictionary, then this pattern is used to create handlers/loggers. Refer to User-defined Objects for more details.

Configuring using Configparse-Format Files

The logging library also supports loading configuration from a configparser-format file via the <a href="https://docs.python.org/3/library/logging.config.html#logging.config.fileConfig" target="_blank" rel="noopener noreferrer">logging.config.fileConfig() function. Since this is an older API that does not provide all of the functionalities offered by the dictionary-based configuration scheme, the use of the dictConfig() function is recommended; hence, we’re not discussing the fileConfig() function and the configparser file format in this tutorial.

Configuring Over The Wire

While the above APIs can be used to update the logging configuration when the client is running (e.g., web services), programming such update mechanisms from scratch can be cumbersome. The logging.config.listen() function alleviates this issue. This function starts a socket server that accepts new configurations over the wire and loads them via dictConfig() or fileConfig() functions. Refer to logging.config.listen() for more details.

Loading and Storing Configuration

Since the configuration provided to dictConfig() is nothing but a collection of nested dictionaries, a logging configuration can be easily represented in JSON and YAML format. Consequently, programs can use the json module in Python’s standard library or external YAML processing libraries to read and write logging configurations from files.

For example, the following snippet suffices to load the logging configuration stored in JSON format.

import json, logging.config

with open('logging-config.json', 'rt') as f:
  config = json.load(f)
  logging.config.dictConfig(config)

Limitations

In the supported configuration scheme, we cannot configure filters to filter beyond simple name-based filtering. For example, we cannot create a filter that admits only log requests created between 6 PM and 6 AM. We need to program such filters in Python and add them to loggers and handlers via factory functions or the addFilter() method.

Python Logging Good Practices and Gotchas

In this section, we will list a few good practices and gotchas related to the logging library. This list stems from our experience, and we intend it to complement the extensive information available in the Logging HOWTO and Logging Cookbook sections of Python’s documentation.

Since there are no silver bullets, all good practices and gotchas have exceptions that are almost always contextual. So, before using the following good practices and gotchas, consider their applicability in the context of your application and ascertain whether they are appropriate in your application’s context.

Good Practices

Create loggers using the logging.getLogger() factory function

The factory function helps the library manage the mapping from logger names to logger instances and maintain a hierarchy of loggers. In turn, this mapping and hierarchy offer the following benefits:

  1. Clients can use the factory function to access the same logger in different parts of the program by merely retrieving the logger by its name.
  2. Only a finite number of loggers are created at runtime (under normal circumstances).
  3. Log requests can be propagated up the logger hierarchy.
  4. When unspecified, the threshold logging level of a logger can be inferred from its ascendants.
  5. The configuration of the logging library can be updated at runtime by merely relying on the logger names.

Use the logging.<logging level>() function or the Logger.<logging level>()  methods to log at pre-defined logging levels

Besides making the code a bit shorter, the use of these functions/methods helps partition the logging statements in a program into two sets:

  1. Those that issue log requests with pre-defined logging levels
  2. Those that issue log requests with custom logging levels.

Use pre-defined logging levels

As described in the Logging Level section in the Concepts and API chapter, the pre-defined logging levels offered by the library capture almost all logging scenarios that occur in programs. Further, since most developers are familiar with pre-defined logging levels (as most logging libraries across different programming languages offer very similar levels), the use of pre-defined levels can help lower deployment, configuration, and maintenance burden. So, unless required, use pre-defined logging levels.

Create module-level loggers

While creating loggers, we can create a logger for each class or create a logger for each module. While the first option enables fine-grained configuration, it leads to more loggers in a program, i.e., one per class. In contrast, the second option can help reduce the number of loggers in a program. So, unless such fine-grained configuration is necessary, create module-level loggers.

Name module-level loggers with the name of the corresponding modules.

Since the logger names are string values that are not part of the namespace of a Python program, they will not clash with module names. Hence, use the name of a module as the name of the corresponding module-level logger. With this naming, logger naming piggybacks on the dot notation based module naming and, consequently, simplifies referring to loggers.

Use logging.LoggerAdatper to inject local contextual information

As demonstrated in Snippet 4, we can use logging.LoggerAdapter to inject contextual information into log records. LoggerAdapter can also be used to modify the log message and the log data provided as part of a log request.

Since the logging library does not manage these adapters, they cannot be accessed via common names. For this reason, use them to inject contextual information that is local to a module or a class.

Use filters or logging.setLogRecordFactory() to inject global contextual information

There are two options to seamlessly inject global contextual information (that is common across an app) into log records.

The first option is to use the filter support to modify the log record arguments provided to filters. For example, the following filter injects version information into incoming log records.

def version_injecting_filter(logRecord):
    logRecord.version = '3'
    return True

There are two downsides to this option. First, if filters depend on the data in log records, then filters that inject data into log records should be executed before filters that use the injected data. Hence, the order of filters added to loggers and handlers becomes crucial. Second, the option “abuses” the support to filter log records to extend log records.

The second option is to initialize the logging library with a log record creating factory function via logging.setLogRecordFactory(). Since the injected contextual information is global, it can be injected into log records when they are created in the factory function and be sure the data will be available to every filter, formatter, logger, and handler in the program.

The downside of this option is that we have to ensure factory functions contributed by different components in a program play nicely with each other. While log record factory functions could be chained, such chaining increases the complexity of programs.

Use the data-based approach to configure the logging library

If your program involves multiple modules and possibly third-party components, then use the data-based approach described in the Configuration chapter to configure the logging library.

Attach common handlers to the loggers higher up the logger hierarchy

If a handler is common to two loggers of which one is the descendant of the other, then attach the handler to the ascendant logger and rely on the logging library to propagate the log requests from the descendant logger to the handlers of the ascendant logger. If the propagate attribute of loggers has not been modified, this pattern helps avoid duplicate messages.

Use logging.disable() function to inhibit the processing of log requests below a certain logging level across all loggers

A logger processes a log request if the logging level of the log request is at least as high as the logger’s effective logging level. A logger’s effective logging level is the higher of two logging levels: the logger’s threshold logging level and the library-wide logging level. We can set the library-wide logging level via the logging.disable(level) function. By default, the library-wide logging level is 0, i.e., log requests of every logging level will be processed.

Using this function, we can throttle the logging output of an app by increasing the logging level across the entire app.

What about caching references to loggers?

Before moving on to gotchas, let’s consider the goodness of the common practice of caching references to loggers and accessing loggers via cached references, e.g., this is how the _logger attribute was used in the previous code snippets.

This coding pattern avoids repeated invocations of the logging.getLogger() function to retrieve the same module-level logger; hence, it helps eliminate redundant retrievals. However, such eliminations can lead to lost log requests if retrievals are not redundant. For example, suppose the logging library configuration in a long-running web service is updated with this disabling_existing_loggers option. Since such an update would disable cached loggers, none of the logging statements that use cached loggers would log any requests. While we can remedy this situation by updating cached references to loggers, a simpler solution would be to use the logging.getLogger() function instead of caching references.

In short, caching references to loggers is not always a good practice. So, consider the context of the program while deciding to cache references to loggers.

Gotchas

Filters fail

When the logging library invokes the filters associated with handlers and loggers, the library assumes the filters will always execute to completion, i.e., not fail on errors. So, there is no error handling logic in the library to deal with failing filters. Consequently, when a filter fails (to execute to completion), the corresponding log request will not be logged.

Ensure filters will execute to completion. More so, when using custom filters and using additional data in filtering.

Formatters fail

The logging library makes a similar assumption about formatters, i.e., formatters will always execute to completion. Consequently, when a formatter fails to execute to completion, the corresponding log request will not be logged.

Ensure formatters will execute to completion.

Required keys are missing is the extra argument

If the filters/formatters refer to keys of the extra argument provided as part of logging methods, then the filters/formatters can fail when the extra argument does not provide a referred key.

Ensure every key of the extra argument used in a filter or a formatter are available in every triggering logging statement.

Keys in the extra argument clash with required attributes

The logging library adds the keys of the extra argument (to various logging methods) as attributes to log records. However, if asctime and message occur as keys in the extra argument, then the creation of log records will fail, and the corresponding log request will not be logged.

Similar failure occurs if args, exc_info, lineno, msg, name, or pathname occur as keys in the extra argument; these are attributes of the LogRecord class.

Ensure asctime, message, and certain attributes of LogRecord do not appear as keys in the extra argument of logging methods.

Libraries using custom logging levels are combined

When a program and its dependent libraries use the logging library, their logging requirements are combined by the underlying logging library that services these requirements. In this case, if the components of a program use custom logging levels that are mutually inconsistent, then the logging outcome can be unpredictable.

Don’t use custom logging levels, specifically, in libraries.

Filters of ancestor loggers do not fire

By default, log requests are propagated up the logger hierarchy to be processed by the handlers of ancestor loggers. While the filters of the handlers process such log requests, the filters of the corresponding loggers do not process such log requests.

To apply a filter to all log requests submitted to a logger, add the filter to the logger.

Ids of handlers/filters/formatters clash

If multiple handlers share the same handler id in a configuration, then the handler id refers to the handler that is created last when the configuration is processed. The same happens amongst filters and formatters that share ids.

When a client terminates, the logging library will execute the cleanup logic of the handler associated with each handler id. So, if multiple handlers have the same id in a configuration, then the cleanup logic of all but the handler created last will not be executed and, hence, result in resource leaks.

Use unique ids for objects of a kind in a configuration.

Python Logging Performance Considerations

While logging statements help capture information at locations in a program, they contribute to the cost of the program in terms of execution time (e.g., logging statements in loops) and storage (e.g., logging lots of data). Although cost-free yet useful logging is impossible, we can reduce the cost of logging by making choices that are informed by performance considerations.

Configuration-Based Considerations

After adding logging statements to a program, we can use the support to configure logging (described earlier) to control the execution of logging statements and the associated execution time. In particular, consider the following configuration capabilities when making decisions about logging-related performance.

  1. Change logging levels of loggers: This change helps suppress log messages below a certain log level. This helps reduce the execution cost associated with unnecessary creation of log records.
  2. Change handlers: This change helps replace slower handlers with faster handlers (e.g., during testing, use a transient handler instead of a persistent handler) and even remove context-irrelevant handlers. This reduces the execution cost associated with unnecessary handling of log records.
  3. Change format: This change helps exclude unnecessary parts of a log record from the log (e.g., exclude IP addresses when executing in a single node setting). This reduces the execution cost associated with unnecessary handling of parts of log records.

The above changes the range over coarser to finer aspects of logging support in Python.

Code-Based Considerations

While the support to configure logging is powerful, it cannot help control the performance impact of implementation choices baked into the source code. Here are a few such logging-related implementation choices and the reasons why you should consider them when making decisions about logging-related performance.

Do not execute inactive logging statements

Upon adding the logging module to Python’s standard library, there were concerns about the execution cost associated with inactive logging statements — logging statements that issue log requests with logging level lower than the threshold logging level of the target logger. For example, how much extra time will a logging statement that invokes logger.debug(...) add to a program’s execution time when the threshold logging level of logger is logging.WARN? This concern led to client-side coding patterns (as shown below) that used the threshold logging level of the target logger to control the execution of the logging statement.

# client code
...
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(msg)
...

Today, this concern is not valid because the logging methods in the logging.Logger class perform similar checks and process the log requests only if the checks pass. For example, as shown below, the above check is performed in the logging.Logger.debug method.

# client code
...
logger.debug(msg)
...

# logging library code

class Logger:
    ...
    def debug(self, msg, *args, **kwargs):
        if self.isEnabledFor(DEBUG):
            self._log(DEBUG, msg, args, **kwargs)

Consequently, inactive logging statements effectively turn into no-op statements and do not contribute to the execution cost of the program.

Even so, one should consider the following two aspects when adding logging statements.

  1. Each invocation of a logging method incurs a small overhead associated with the invocation of the logging method and the check to determine if the logging request should proceed, e.g., a million invocations of logger.debug(...) when threshold logging level of logger was logging.WARN took half a second on a typical laptop. So, while the cost of an inactive logging statement is trivial, the total execution cost of numerous inactive logging statements can quickly add up to be non-trivial.
  2. While disabling a logging statement inhibits the processing of log requests, it does not inhibit the calculation/creation of arguments to the logging statement. So, if such calculations/creations are expensive, then they can contribute non-trivially to the execution cost of the program even when the corresponding logging statement is inactive.

Do not construct log messages eagerly

Clients can construct log messages in two ways: eagerly and lazily.

  1. The client constructs the log message and passes it on to the logging method, e.g., logger.debug(f'Entering method Foo: {x=}, {y=}').
    This approach offers formatting flexibility via f-strings and the format() method, but it involves the eager construction of log messages, i.e., before the logging statements are deemed as active.
  2. The client provides a printf-style message format string (as a msg argument) and the values (as a args argument) to construct the log message to the logging method, e.g., logger.debug('Entering method %s: x=%d, y=%f', 'Foo', x, y). After the logging statement is deemed as active, the logger constructs the log message using the string formatting operator %.
    This approach relies on an older and quirky string formatting feature of Python but it involves the lazy construction of log messages.

While both approaches result in the same outcome, they exhibit different performance characteristics due to the eagerness and laziness of message construction.

For example, on a typical laptop, a million inactive invocations of logger.debug('Test message {0}'.format(t)) takes 2197ms while a million inactive invocations of logger.debug('Test message %s', t) takes 1111ms when t is a list of four integers. In the case of a million active invocations, the first approach takes 11061ms and the second approach took 10149ms. A savings of 9–50% of the time taken for logging!

So, the second (lazy) approach is more performant than the first (eager) approach in cases of both inactive and active logging statements. Further, the gains would be larger when the message construction is non-trivial, e.g., use of many arguments, conversion of complex arguments to strings.

Do not gather unnecessary under-the-hood information

By default, when a log record is created, the following data is captured in the log record:

  1. Identifier of the current process
  2. Identifier and name of the current thread
  3. Name of the current process in the multiprocessing framework
  4. Filename, line number, function name, and call stack info of the logging statement

Unless these bits of data are logged, gathering them unnecessarily increases the execution cost. So, if these bits of data will not be logged, then configure the logging framework to not gather them by setting the following flags.

  1. logging.logProcesses = False
  2. logging.logThreads = False
  3. logging.logMultiProcessing = False
  4. logging._srcFile = None

Do not block the main thread of execution

There are situations where we may want to log data in the main thread of execution without spending almost any time logging the data. Such situations are common in web services, e.g., a request processing thread needs to log incoming web requests without significantly increasing its response time. We can tackle these situations by separating concerns across threads: a client/main thread creates a log record while a logging thread logs the record. Since the task of logging is often slower as it involves slower resources (e.g., secondary storage) or other services (e.g., logging services such as Coralogix, pub-sub systems such as Kafka), this separation of concerns helps minimize the effort of logging on the execution time of the main/client thread.

The Python logging library helps handle such situations via the QueueHandler and QueueListener classes as follows.

  1. A pair of QueueHandler and QueueListener instances are initialized with a queue.
  2. When the QueueHandler instance receives a log record from the client, it merely places the log request in its queue while executing in the client’s thread. Given the simplicity of the task performed by the QueueHandler, the client thread hardly pauses.
  3. When a log record is available in the QueueListener queue, the listener retrieves the log record and executes the handlers registered with the listener to handle the log record. In terms of execution, the listener and the registered handlers execute in a dedicated thread that is different from the client thread.

Note: While QueueListener comes with a default threading strategy, developers are not required to use this strategy to use QueueHandler. Instead, developers can use alternative threading strategies that meet their needs.

That about wraps it up for this Python logging guide. If you’re looking for a log management solution to centralize your Python logs, check out our easy to configure Python integration.

Start solving your production issues faster

Let's talk about how Coralogix can help you better understand your logs

Managed, Scaled and Compliant ELK Stack

No credit card required

Get a personalized demo

Jump on a call with one of our experts and get a live personalized demonstration