Logging in Python
Logging is critical in many kinds of software. For long-running software systems, it enables continuous telemetry and reporting. And for all software, it can provide priceless information for troubleshooting and post-mortems. The bigger the application, the more important logging becomes. But even small scripts can benefit.
Python provides logging through the logging module. In my opinion,
this module is one of the more technically impressive parts of
Python’s standard library. It’s well-designed, flexible, thread-safe,
and richly powerful. It’s also complex, with many moving parts, making
it hard to learn well. This chapter gets you over most of that
learning curve, so you can fully benefit from what logging has to
offer. The payoff is well worth it, and will serve you for years.
Broadly, there are two ways to use logging. One, which I’m calling
the basic interface,
is appropriate for scripts - meaning, Python programs that are small
enough to fit in a single file. For more substantial applications,
it’s typically better to use logger objects, which give more flexible, centralized
control, and access to logging hierarchies. We’ll start with the
former, to introduce key ideas.
The Basic Interface
Here’s the easiest way to use Python’s logging module:
- import logging
- logging.warning('Look out!')
Save this in a script and run it, and you’ll see this printed out:
WARNING:root:Look out!
You can do useful logging right away, by calling functions in the
logging module itself. Notice you invoke logging.warning(), and the
output line starts with WARNING. You can also call
logging.error(), which gives a different prefix:
ERROR:root:Look out!
We say that warning and error are
at different message log levels. You have a spectrum of log levels
to choose from, in order of increasing severity:[30]
- debug
- Detailed information, typically of interest only when diagnosing problems.
- info
- Confirmation that things are working as expected.
- warning
- An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
- error
- Due to a more serious problem, the software has not been able to perform some function.
- critical
- A serious error, indicating that the program itself may be unable to continue running.
You use them all just like logging.warning() and logging.error():
- logging.debug("Small detail. Useful for troubleshooting.")
- logging.info("This is informative.")
- logging.warning("This is a warning message.")
- logging.error("Uh oh. Something went wrong.")
- logging.critical("We have a big problem!")
Each has a corresponding uppercased constant in the library (e.g.,
logging.WARNING for logging.warning()). You use these when
defining the log level threshold. Run the above, and here is the
output:
WARNING:root:This is a warning message.
ERROR:root:Uh oh. Something went wrong.
CRITICAL:root:We have a big problem!
Where did the debug and info messages go? As it turns out, the default
logging threshold is logging.WARNING, which means only messages of
that severity or greater are actually generated; the others are
ignored completely. The order matters in the list above; debug is
considered strictly less severe than info, and so on. Change the log
level threshold using the basicConfig function:
- logging.basicConfig(level=logging.INFO)
- logging.info("This is informative.")
- logging.error("Uh oh. Something went wrong.")
Run this new program, and the INFO message gets printed:
INFO:root:This is informative.
ERROR:root:Uh oh. Something went wrong.
Again, the order is debug(), info(), warning(), error() and
critical(), from lowest to highest severity. When we set the log
level threshold, we declare that we only want to see messages of that
level or higher. Messages of a lower level are not printed. When you set
level to logging.DEBUG, you see everything; set it to
logging.CRITICAL, and you only see critical messages, and so on.
The phrase "log level" means two different things, depending on
context. It can mean the severity of a message, which you set by
choosing which of the functions to use - logging.warning(), etc. Or
it can mean the threshold for ignoring messages, which is signaled by
the constants: logging.WARNING, etc.
You can also use the constants in the more general logging.log
function - for example, a debug message:
- logging.log(logging.DEBUG,
- "Small detail. Useful for troubleshooting.")
- logging.log(logging.INFO, "This is informative.")
- logging.log(logging.WARNING, "This is a warning message.")
- logging.log(logging.ERROR, "Uh oh. Something went wrong.")
- logging.log(logging.CRITICAL, "We have a big problem!")
This lets you control the log level dynamically, at runtime. For example:
- def log_results(message, level=logging.INFO):
- logging.log(level, "Results: " + message)
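Hypothetically, a caller can then choose the severity at the call site, without touching the helper itself. A self-contained sketch (the threshold is lowered here so both calls are visible):

```python
import logging

logging.basicConfig(level=logging.DEBUG)

def log_results(message, level=logging.INFO):
    # The severity is a runtime value, not fixed in the code.
    logging.log(level, "Results: " + message)

log_results("all records processed")                  # logged at INFO
log_results("3 records failed", level=logging.ERROR)  # logged at ERROR
```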
Why do we have log levels?
If you haven’t worked with similar logging systems before, you may
wonder why we have different log levels, and why you’d want to control
the filtering threshold. It’s easiest to see this if you’ve written
Python scripts that include a number of print() statements -
including some useful for diagnosis when something goes wrong, but a
distraction when everything is working fine.
The fact is, some of those print() statements are more important
than others. Some indicate mission-critical problems you always want
to know about - possibly to the point of waking up an engineer, so
they can deploy a fix immediately. Some are important, but can wait
until the next work day - and you definitely do NOT want to wake
anyone up for that. Some are details which may have been important in
the past, and might be in the future, so you don’t want to remove
them; in the meantime, they are just line noise.
Having log levels solves all these problems. As you develop and evolve
your code over time, you continually add new logging statements of the
appropriate severity. You now even have the freedom to be
proactive. With "logging" via print(), each log statement has a cost
- certainly in signal-to-noise ratio, and also potentially in
performance. So you might debate whether to include that print
statement at all. But with logging, you can insert info messages,
for example, to log certain events occurring as they should. In
development, those INFO messages can be very useful to verify certain
things are happening, so you can modify the log level to produce
them. In production, you may not want them cluttering up the
logs, so you just set the threshold higher. Or if you are doing some
kind of monitoring in production, and temporarily need that
information, you can adjust the log level threshold to output those
messages; when you are finished, you can adjust it back to exclude
them again.
When troubleshooting, you can liberally introduce debug-level
statements to provide extra detail. When done, you can
just adjust the log level to turn them off. You can leave them in the
code without cost, eliminating any risk of introducing more bugs when
you go through and remove them. This also leaves them available if
they are needed in the future.
The log level
symbols are actually set to integers. You can theoretically use these
numbers instead, or even define your own log levels that are (for
example) a third of the way between WARNING and ERROR. In normal
practice, it’s best to use the predefined logging levels. Doing
otherwise makes your code harder to read and maintain, and isn’t
worthwhile unless you have a compelling reason.
For reference, the numbers are 50 for CRITICAL, 40 for ERROR, 30
for WARNING, 20 for INFO, and 10 for DEBUG. So when you set the
log level threshold, it’s actually setting a number. The only log
messages emitted are those with a level greater than or equal to that
number.
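You can verify these numbers directly; the logging.getLevelName function maps between the two representations. A quick sketch:

```python
import logging

# The level constants are plain integers:
print(logging.DEBUG, logging.INFO, logging.WARNING,
      logging.ERROR, logging.CRITICAL)
# → 10 20 30 40 50

# getLevelName maps a number to its name...
print(logging.getLevelName(30))  # → WARNING
# ...and an unregistered number gets a generic label:
print(logging.getLevelName(25))  # → Level 25
```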
Configuring The Basic Interface
You saw above that you can change the log level threshold by calling a
function called basicConfig:
- logging.basicConfig(level=logging.INFO)
- logging.debug("You won't see this message!")
- logging.error("But you will see this one.")
If you use it at all, basicConfig must be called exactly once, and
it must happen before the first logging event. (Meaning, before the
first call to debug(), or warning(), etc.) Additionally, if your
program has several threads, it must be called from the main thread -
and only the main thread. [31]
You already met one of the configuration options, level. This is set
to the log level threshold, and is one of DEBUG, INFO, WARNING,
ERROR, or CRITICAL. Some of the other options include:
filename - Write log messages to the given file, rather than stderr.
filemode - Set to "a" to append to the log file (the default), or "w" to overwrite.
format - The format of log records.
level - The log level threshold, described above.
By default, log messages are written to standard error. You can also
write them to a file, one per line, to easily read later. Do this by
setting filename to the log file path. By default, new log messages
are appended to the end of the file, preserving any existing
content. If you’d rather the file be emptied before the first
log message, set filemode to "w". Be careful about doing that, of course,
because you can easily lose old log messages if the application
restarts:
- # Wipes out previous log entries when program restarts
- logging.basicConfig(filename="log.txt", filemode="w")
- logging.error("oops")
The other valid value is "a", for append - that’s the default, and probably will serve you better in production. "w" can be useful during development, though.
format defines what chunks of information the final log record will
include, and how they are laid out. These chunks are called
"attributes" in the logging module docs. One of these attributes is
the actual log message - the string you pass when you call
logging.warning(), and so on. Often you will want to include other
attributes as well. Consider the kind of log record we saw above:
WARNING:root:Collision imminent
This record has three attributes, separated by colons. First is the
log level name; last is the actual string message you pass when you
call logging.warning(). (In the middle is the name of the underlying
logger object. basicConfig uses a logger called "root"; we’ll learn
more about that later.)
You specify the layout you want by setting format to a string that
defines certain named fields, according to percent-style
formatting. Three of them are levelname, the log level; message,
the message string passed to the logging function; and name, the
name of the underlying logger. Here’s an example:
- logging.basicConfig(
- format="Log level: %(levelname)s, msg: %(message)s")
- logging.warning("Collision imminent")
If you run this as a program, you get the following output:
Log level: WARNING, msg: Collision imminent
It turns out the default formatting string is
%(levelname)s:%(name)s:%(message)s
You indicate named fields in percent-formatting by %(FIELDNAME)X,
where "X" is a type code: s for string, d for integer (decimal),
and f for floating-point.
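For example, a format string can combine several attributes at once (the timestamp in the output will vary, of course):

```python
import logging

# Timestamp, level, source line number, then the message itself.
logging.basicConfig(
    format="%(asctime)s %(levelname)s [line %(lineno)d] %(message)s",
    level=logging.INFO)
logging.info("Engines at full power")
# Emits something like:
# 2024-05-01 14:03:59,123 INFO [line 7] Engines at full power
```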
Many other attributes are provided, if you want to include them. Here’s a select few from the full list:[32]
| Attribute | Format | Description |
|---|---|---|
| asctime | %(asctime)s | Human-readable date/time |
| funcName | %(funcName)s | Name of function containing the logging call |
| lineno | %(lineno)d | The line number of the logging call |
| message | %(message)s | The log message |
| pathname | %(pathname)s | Full pathname of the source file of the logging call |
| levelname | %(levelname)s | Text logging level for the message ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') |
| name | %(name)s | The logger’s name |
You might be wondering why log record format strings use Python 2’s ancient percent-formatting style, even in modern Python. As it turns out, the conversion was attempted, but backwards-compatibility reasons made percent-formatting the only practical choice for the logging module, even after the Python 3 reboot.
If you want to use the newer string formatting badly enough, there are things you can do - there’s even a standard recipe.[33] But doing so is complicated enough that it may not be worth the effort, and it won’t help with legacy code. I recommend you simply cooperate with the situation, and use percent formatting with your Python logging.
Passing Arguments
You often want to include some kind of runtime data in the logged message. When you construct the message to log, specify the final log message like this:
- num_fruits = 14
- fruit_name = "oranges"
- logging.info(
- "We ate %d of your %s. Thanks!",
- num_fruits, fruit_name)
The output:
INFO:root:We ate 14 of your oranges. Thanks!
We call info with three parameters. First is the format
string; the second and third are arguments. The general form is
logging.info(format, *args)
You can pass zero or more arguments, so long as each has a field in the format string:
- # Do it like this:
- logging.info("%s, %s, %s, %s, %s, %s and %s",
-     "Doc", "Happy", "Sneezy", "Bashful",
-     "Dopey", "Sleepy", "Grumpy")
You must resist the obvious temptation to format the string fully, and pass that to the logging function:
- num_fruits = 14
- fruit_name = "oranges"
- logging.warning(
- "Don't do this: %d %s" % (num_fruits, fruit_name))
- logging.warning(
- "Or this: {:d} {:s}".format(
- num_fruits, fruit_name))
- logging.warning(
- f"And definitely not this: {num_fruits:d} {fruit_name:s}")
All of these work, of course, in the sense that you will get correct log
messages. But doing so surrenders important performance benefits
that logging normally provides. Remember: when the line of code
with the log message is executed, it may not actually trigger a
log event. If the log level threshold is higher than the message
itself, the line does nothing. In that case, there is no
reason to format the string.
In the "do it like this" form, the string is formatted if and only if a log event actually happens, so that’s fine. But if you format the string yourself, it’s always formatted. That takes up system memory and CPU cycles even if no logging takes place. If the code path with the logging call is only executed occasionally, that’s not a big deal. But it impairs the program when a debug message is logged in the middle of a tight loop. When you originally code the line, you never really know where it might migrate in the future, or who will call your function in ways you never imagined.
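You can observe the deferred formatting directly. In this sketch, a small class counts how often it is actually converted to a string; below the threshold, the conversion never happens at all:

```python
import logging

class ExpensiveRepr:
    """Counts how many times it is actually rendered as a string."""
    conversions = 0
    def __str__(self):
        ExpensiveRepr.conversions += 1
        return "expensive detail"

logging.basicConfig(level=logging.WARNING)
detail = ExpensiveRepr()

# Below the threshold: the format string is never applied,
# so __str__ is never called.
logging.debug("Detail: %s", detail)
assert ExpensiveRepr.conversions == 0

# At or above the threshold, formatting happens exactly once.
logging.warning("Detail: %s", detail)
assert ExpensiveRepr.conversions == 1
```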
So just use the supported form, where the first argument is the format string, and subsequent arguments are the parameters for it. You can also use named fields, by passing a dictionary as the second argument - though you must use the percent-formatting named-field format:
- fruit_info = {"count": 14, "name": "oranges"}
- logging.info(
- "We ate %(count)d of your %(name)s. Thanks!",
- fruit_info)
Beyond Basic: Loggers
The basic interface is simple and easy to set up. It works well in
single-file scripts. Larger Python applications tend to have different
logging needs, however. logging meets these needs through a richer
interface, called logger objects - or simply, loggers.
Actually, you have been using a logger object all along: when you call
logging.warning() (or the other log functions), they convey messages
through what is called the root logger - the primary, default logger
object. This is why the word "root" shows in some example
output.
logging.basicConfig operates on this root logger. You can fetch the
actual root logger object by calling logging.getLogger:
- >>> logger = logging.getLogger()
- >>> logger.name
- 'root'
As you can see, it knows its name is "root". Logger objects have all
the same functions (methods, actually) the logging module itself
has:
- import logging
- logger = logging.getLogger()
- logger.debug("Small detail. Useful for troubleshooting.")
- logger.info("This is informative.")
- logger.warning("This is a warning message.")
- logger.error("Uh oh. Something went wrong.")
- logger.critical("We have a big problem!")
Save this in a file and run it, and you’ll see the following output:
This is a warning message.
Uh oh. Something went wrong.
We have a big problem!
This is different from what we saw with basicConfig, which printed
out this instead:
WARNING:root:This is a warning message.
ERROR:root:Uh oh. Something went wrong.
CRITICAL:root:We have a big problem!
At this point, we’ve taken steps backward compared to
basicConfig. Not only is the log message unadorned by the log level,
or anything else useful; the log level threshold is also hard-coded to
logging.WARNING, with no way to change it. And the logging output will
be written to standard error, and nowhere else, regardless of where
you actually need it to go.
Let’s take inventory of what we want to control here. We want to choose our log record format. And further, we want to be able to control the log level threshold, and write messages to different streams and destinations. You do this with a tool called handlers.
Log Destinations: Handlers and Streams
By default, loggers write to standard error. You can select a different destination - or even several destinations - for each log record:
- You can write log records to a file. Very common.
- You can, while writing records to that file, also parrot it to stderr.
- Or to stdout. Or both.
- You can simultaneously log messages to two different files.
- In fact, you can log (say) INFO and higher messages to one file, and ERROR and higher to another.
- You can write log records to a remote log server, accessed via a REST HTTP API.
- Mix and match all the above, and more.
- And you can set a different, custom log format for each destination.
This is all managed through what are called handlers. In Python logging, a handler’s job is to take a log record, and make sure it gets recorded in the appropriate destination. That destination can be a file; a stream like stderr or stdout; or something more abstract, like inserting into a queue, or transmitting via an RPC or HTTP call.
By default, logger objects don’t have any handlers. You can verify this
using the hasHandlers method:
- >>> logger = logging.getLogger()
- >>> logger.hasHandlers()
- False
With no handler, a logger has the following behavior:
- Messages are written to stderr.
- Only the message is written, nothing else. There’s no way to add fields or otherwise modify it.
- The log level threshold is logging.WARNING. There is no way to change that.
To change this, your first step is to create a handler. Nearly all logger objects you ever use will have custom handlers. Let’s see how to create a simple handler that writes messages to a file, called "log.txt".
- import logging
- logger = logging.getLogger()
- log_file_handler = logging.FileHandler("log.txt")
- logger.addHandler(log_file_handler)
- logger.debug("A little detail")
- logger.warning("Boo!")
The logging module provides a class called FileHandler. It takes a
file path argument, and will write log records into that file, one per
line. When you run this code, log.txt will be created (if it doesn’t
already exist), and will contain the string "Boo!" followed by a
newline. (If log.txt did exist already, the logged message would be
appended to the end of the file.)
But "A little detail" is not written, because it’s below the default
logger threshold of WARNING. We change that by calling a method
named setLevel on the logger object:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
- log_file_handler = logging.FileHandler("log.txt")
- logger.addHandler(log_file_handler)
- logger.debug("A little detail")
- logger.warning("Boo!")
This writes the following in "log.txt":
A little detail Boo!
Confusingly, you can call setLevel on a logger with no handlers,
but it has no effect:
- # Doing it wrong:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG) # No effect.
- logger.debug("This won't work :(")
To change the threshold from the default of logging.WARNING, you
must both add a handler, and change the logger’s level.
What if you want to log to stdout? Do that with a StreamHandler:
- import logging
- import sys
- logger = logging.getLogger()
- out_handler = logging.StreamHandler(sys.stdout)
- logger.addHandler(out_handler)
- logger.warning("Boo!")
If you save this in a file and run it, you’ll get "Boo!" on standard
output. Notice that logging.StreamHandler takes sys.stdout as its
argument. You can create a StreamHandler without an argument too, in
which case it will write its records to standard error:
- import logging
- logger = logging.getLogger()
- # Same as StreamHandler(sys.stderr)
- stderr_handler = logging.StreamHandler()
- logger.addHandler(stderr_handler)
- logger.warning("This goes to standard error")
In fact, you can pass any file-like object; the object just needs to
define compatible write and flush methods. Theoretically, you
could even log to a file by creating a handler like
StreamHandler(open("log.txt", "a")) - but in that case, it’s better
to use a FileHandler, so it can manage opening and closing the file.
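As a quick illustration of the file-like-object point, an io.StringIO buffer works as a destination too. A sketch, useful mainly for capturing log output in tests:

```python
import io
import logging

# Any object with write() and flush() will do; StringIO qualifies.
stream = io.StringIO()
logger = logging.getLogger("capture")
handler = logging.StreamHandler(stream)
logger.addHandler(handler)

logger.warning("Captured!")
print(repr(stream.getvalue()))  # → 'Captured!\n'
```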
When creating a handler, your needs are nearly always met by either
StreamHandler or FileHandler. There are other predefined handlers,
too, useful when logging to certain specialized destinations:
- WatchedFileHandler and RotatingFileHandler, for logging to rotated log files
- SocketHandler and DatagramHandler, for logging over network sockets
- HTTPHandler, for logging over an HTTP REST interface
- QueueHandler and QueueListener, for queuing log records across thread and process boundaries
See the official docs[34] for more details.
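As a taste of one of these, here is a sketch using RotatingFileHandler, which caps the log file’s size and rolls old content into numbered backup files (the file name and limits here are arbitrary choices):

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Keep each file under ~1 MB, retaining up to 3 rotated
# backups (app.log.1, app.log.2, app.log.3).
rotating_handler = RotatingFileHandler(
    "app.log", maxBytes=1_000_000, backupCount=3)
logger.addHandler(rotating_handler)

logger.info("This record lands in app.log.")
```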
Logging to Multiple Destinations
Suppose you want your long-running application to log all messages to a file, including debug-level records. At the same time, you want warnings, errors, and criticals logged to the console. How do you do this?
We’ve given you part of the answer already. A single logger object can
have multiple handlers: all you have to do is call addHandler
multiple times, passing a different handler object for each. For
example, here is how you parrot all log messages to the console (via
standard error) and also to a file:
- import logging
- logger = logging.getLogger()
- # Remember, StreamHandler defaults to using sys.stderr
- console_handler = logging.StreamHandler()
- logger.addHandler(console_handler)
- # Now the file handler:
- logfile_handler = logging.FileHandler("log.txt")
- logger.addHandler(logfile_handler)
- logger.warning(
- "This goes to both the console, AND log.txt.")
This is combining what we learned above. We create two handlers - a
StreamHandler named console_handler, and a FileHandler named
logfile_handler - and add both to the same logger (via
addHandler). That’s all you need to log to multiple destinations in
parallel. Sure enough, if you save the above in a script and run it,
you’ll find the messages are both written into "log.txt", as well as
printed on the console (through standard error).
We aren’t done, though. How do we make it so every record is written
in the log file, but only those of logging.WARNING or higher get
sent to the console screen? Do this by setting log level thresholds
for both the logger object and the individual handlers. Both logger
objects and handlers have a method called setLevel, taking a log
level threshold as an argument:
- my_logger.setLevel(logging.DEBUG)
- my_handler.setLevel(logging.INFO)
If you set the level for a logger, but not its handlers, the handlers inherit from the logger:
- my_logger.setLevel(logging.ERROR)
- my_logger.addHandler(my_handler)
- my_logger.error("This message is emitted by my_handler.")
- my_logger.debug("But this message will not be.")
And you can override that at the handler level. Here, I create two handlers. One handler inherits its threshold from the logger, while the other does its own thing:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
-
- verbose_handler = logging.FileHandler("verbose.txt")
- logger.addHandler(verbose_handler)
-
- terse_handler = logging.FileHandler("terse.txt")
- terse_handler.setLevel(logging.WARNING)
- logger.addHandler(terse_handler)
-
- logger.debug("This message appears in verbose.txt ONLY.")
- logger.warning("And this message appears in both files.")
There’s a caveat, though: a handler can only make itself more
selective than its logger, not less. If the logger chooses a threshold
of logging.DEBUG, its handler can choose a threshold of
logging.INFO, or logging.ERROR, and so on. But if the logger defines
a strict threshold - say, logging.INFO - an individual handler cannot
choose a lower one, like logging.DEBUG. So something like this won’t
work:
- # This doesn't quite work...
- import logging
- my_logger = logging.getLogger()
- my_logger.setLevel(logging.INFO)
- my_handler = logging.StreamHandler()
- my_handler.setLevel(logging.DEBUG) # FAIL!
- my_logger.addHandler(my_handler)
- my_logger.debug("No one will ever see this message :(")
There’s a subtle corollary of this. By default, a logger object’s
threshold is set to logging.WARNING. So if you don’t set the logger
object’s log level at all, it implicitly censors all handlers:
- import logging
- my_logger = logging.getLogger()
- my_handler = logging.StreamHandler()
- my_handler.setLevel(logging.DEBUG) # FAIL!
- my_logger.addHandler(my_handler)
- # No one will see this message either.
- my_logger.debug(
- "Because anything under WARNING gets filtered.")
The logger object’s default log level is not always permissive enough for all handlers you might want to define. So you will generally want to start by setting the logger object to the lowest threshold needed by any log-record destination, and tighten that threshold for each handler as needed.
Bringing this all together, we can now accomplish what we originally wanted - to verbosely log everything into a log file, while duplicating only the more interesting messages onto the console:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
- # Warnings and higher only on the console.
- console_handler = logging.StreamHandler()
- console_handler.setLevel(logging.WARNING)
- logger.addHandler(console_handler)
- # But allow everything to go into the log file.
- logfile_handler = logging.FileHandler("log.txt")
- logger.addHandler(logfile_handler)
-
- logger.warning(
- "This goes to both the console, AND into log.txt.")
- logger.debug("While this only goes to the file.")
Add as many handlers as you want. Each can have different log
levels. You can log to many different destinations, using the
different built-in handler types mentioned above. If those don’t do
what you need, implement your own subclass of logging.Handler and
use that.
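A minimal custom handler only needs to override emit. For instance, this hypothetical ListHandler just collects formatted records in memory:

```python
import logging

class ListHandler(logging.Handler):
    """A toy handler that gathers formatted records in a list."""
    def __init__(self):
        super().__init__()
        self.records = []
    def emit(self, record):
        # self.format applies this handler's formatter (or a
        # default one), producing the final string.
        self.records.append(self.format(record))

logger = logging.getLogger("demo")
handler = ListHandler()
logger.addHandler(handler)
logger.warning("Captured in memory")
print(handler.records)  # → ['Captured in memory']
```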
Record Layout with Formatters
We haven’t covered one important detail. So far, we’ve only shown you how to create logger objects that will write just the log message and nothing else. At the very least, you probably want to annotate that with the log level. You may also want to insert the time, or some other information. How do you do that?
The answer is to use a formatter. A formatter converts a log record into something that is recorded in the handler’s destination. That’s an abstract way of saying it; more simply, a typical formatter just converts the record into a usefully-formatted string. That string contains the actual log message, as well as the other fields you care about.
The procedure is to create a Formatter object, then associate it
with a handler (using the latter’s setFormatter method). Creating a
formatter is easy - it normally takes just one argument, the format
string:
- import logging
- my_handler = logging.StreamHandler()
- fmt = logging.Formatter("My message is: %(message)s")
- my_handler.setFormatter(fmt)
- my_logger = logging.getLogger()
- my_logger.addHandler(my_handler)
- my_logger.warning("WAKE UP!!")
If you run this in a script, the output will be:
My message is: WAKE UP!!
Notice the attribute for the message, %(message)s, included in the
string. This is just a normal formatting string, in the older,
percent-formatting style. It’s exactly equivalent to using the
format argument when you call basicConfig. For this reason, you
can use the same attributes, arranged however you like - here’s the
attribute table again, distilled from the
full official list:[35]
| Attribute | Format | Description |
|---|---|---|
| asctime | %(asctime)s | Human-readable date/time |
| funcName | %(funcName)s | Name of function containing the logging call |
| lineno | %(lineno)d | The line number of the logging call |
| message | %(message)s | The log message |
| pathname | %(pathname)s | Full pathname of the source file of the logging call |
| levelname | %(levelname)s | Text logging level for the message ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') |
| name | %(name)s | The logger’s name |
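Pulling several attributes together: Formatter also accepts an optional second argument, datefmt, which controls how %(asctime)s is rendered. A sketch:

```python
import logging

handler = logging.StreamHandler()
fmt = logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S")
handler.setFormatter(fmt)

logger = logging.getLogger()
logger.addHandler(handler)
logger.warning("Shields at 60 percent")
# Emits something like:
# 2024-05-01 14:03:59 WARNING root: Shields at 60 percent
```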
[30] These beautifully crisp descriptions, which I cannot improve upon, are taken from https://docs.python.org/3/howto/logging.html .
[31] These restrictions aren’t in place for logger objects, described later.