Logging in Python
Logging is critical in many kinds of software. For long-running software systems, it enables continuous telemetry and reporting. And for all software, it can provide priceless information for troubleshooting and post-mortems. The bigger the application, the more important logging becomes. But even small scripts can benefit.
Python provides logging through the logging module. In my opinion,
this module is one of the more technically impressive parts of
Python’s standard library. It’s well-designed, flexible, thread-safe,
and richly powerful. It’s also complex, with many moving parts, making
it hard to learn well. This chapter gets you over most of that
learning curve, so you can fully benefit from what logging has to
offer. The payoff is well worth it, and will serve you for years.
Broadly, there are two ways to use logging. One, which I’m calling
the basic interface,
is appropriate for scripts - meaning, Python programs that are small
enough to fit in a single file. For more substantial applications,
it’s typically better to use logger objects, which give more flexible, centralized
control, and access to logging hierarchies. We’ll start with the
former, to introduce key ideas.
The Basic Interface
Here’s the easiest way to use Python’s logging module:
- import logging
- logging.warning('Look out!')
Save this in a script and run it, and you’ll see this printed out:
WARNING:root:Look out!
You can do useful logging right away, by calling functions in the
logging module itself. Notice you invoke logging.warning(), and the
output line starts with WARNING. You can also call
logging.error(), which gives a different prefix:
ERROR:root:Look out!
We say that warning and error are
at different message log levels. You have a spectrum of log levels
to choose from, in order of increasing severity:[30]
- debug
- Detailed information, typically of interest only when diagnosing problems.
- info
- Confirmation that things are working as expected.
- warning
- An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
- error
- Due to a more serious problem, the software has not been able to perform some function.
- critical
- A serious error, indicating that the program itself may be unable to continue running.
You use them all just like logging.warning() and logging.error():
- logging.debug("Small detail. Useful for troubleshooting.")
- logging.info("This is informative.")
- logging.warning("This is a warning message.")
- logging.error("Uh oh. Something went wrong.")
- logging.critical("We have a big problem!")
Each has a corresponding uppercased constant in the library (e.g.,
logging.WARNING for logging.warning()). You use these when
defining the log level threshold. Run the above, and here is the
output:
WARNING:root:This is a warning message.
ERROR:root:Uh oh. Something went wrong.
CRITICAL:root:We have a big problem!
Where did the debug and info messages go? As it turns out, the default
logging threshold is logging.WARNING, which means only messages of
that severity or greater are actually generated; the others are
ignored completely. The order matters in the list above; debug is
considered strictly less severe than info, and so on. Change the log
level threshold using the basicConfig function:
- logging.basicConfig(level=logging.INFO)
- logging.info("This is informative.")
- logging.error("Uh oh. Something went wrong.")
Run this new program, and the INFO message gets printed:
INFO:root:This is informative.
ERROR:root:Uh oh. Something went wrong.
Again, the order is debug(), info(), warning(), error() and
critical(), from lowest to highest severity. When we set the log
level threshold, we declare that we only want to see messages of that
level or higher. Messages of a lower level are not printed. When you set
level to logging.DEBUG, you see everything; set it to
logging.CRITICAL, and you only see critical messages, and so on.
The phrase "log level" means two different things, depending on
context. It can mean the severity of a message, which you set by
choosing which of the functions to use - logging.warning(), etc. Or
it can mean the threshold for ignoring messages, which is signaled by
the constants: logging.WARNING, etc.
You can also use the constants in the more general logging.log
function - for example, a debug message:
- logging.log(logging.DEBUG,
- "Small detail. Useful for troubleshooting.")
- logging.log(logging.INFO, "This is informative.")
- logging.log(logging.WARNING, "This is a warning message.")
- logging.log(logging.ERROR, "Uh oh. Something went wrong.")
- logging.log(logging.CRITICAL, "We have a big problem!")
This lets you control the log level dynamically, at runtime. For example:
- def log_results(message, level=logging.INFO):
- logging.log(level, "Results: " + message)
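Hypothetically, a caller can then choose the severity at the call site, without touching the helper itself. A self-contained sketch (the threshold is lowered here so both calls are visible):

```python
import logging

logging.basicConfig(level=logging.DEBUG)

def log_results(message, level=logging.INFO):
    # The severity is a runtime value, not fixed in the code.
    logging.log(level, "Results: " + message)

log_results("all records processed")                  # logged at INFO
log_results("3 records failed", level=logging.ERROR)  # logged at ERROR
```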
Why do we have log levels?
If you haven’t worked with similar logging systems before, you may
wonder why we have different log levels, and why you’d want to control
the filtering threshold. It’s easiest to see this if you’ve written
Python scripts that include a number of print() statements -
including some useful for diagnosis when something goes wrong, but a
distraction when everything is working fine.
The fact is, some of those print() statements are more important
than others. Some indicate mission-critical problems you always want
to know about - possibly to the point of waking up an engineer, so
they can deploy a fix immediately. Some are important, but can wait
until the next work day - and you definitely do NOT want to wake
anyone up for that. Some are details which may have been important in
the past, and might be in the future, so you don’t want to remove
them; in the meantime, they are just line noise.
Having log levels solves all these problems. As you develop and evolve
your code over time, you continually add new logging statements of the
appropriate severity. You now even have the freedom to be
proactive. With "logging" via print(), each log statement has a cost
- certainly in signal-to-noise ratio, and also potentially in
performance. So you might debate whether to include that print
statement at all. But with logging, you can insert info messages,
for example, to log certain events occurring as they should. In
development, those INFO messages can be very useful to verify certain
things are happening, so you can modify the log level to produce
them. In production, you may not want them cluttering up the
logs, so you just set the threshold higher. Or if you are doing some
kind of monitoring in production, and temporarily need that
information, you can adjust the log level threshold to output those
messages; when you are finished, you can adjust it back to exclude
them again.
When troubleshooting, you can liberally introduce debug-level
statements to provide extra detail. When done, you can
just adjust the log level to turn them off. You can leave them in the
code without cost, eliminating any risk of introducing more bugs when
you go through and remove them. This also leaves them available if
they are needed in the future.
The log level
symbols are actually set to integers. You can theoretically use these
numbers instead, or even define your own log levels that are (for
example) a third of the way between WARNING and ERROR. In normal
practice, it’s best to use the predefined logging levels. Doing
otherwise makes your code harder to read and maintain, and isn’t
worthwhile unless you have a compelling reason.
For reference, the numbers are 50 for CRITICAL, 40 for ERROR, 30
for WARNING, 20 for INFO, and 10 for DEBUG. So when you set the
log level threshold, it’s actually setting a number. The only log
messages emitted are those with a level greater than or equal to that
number.
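You can verify these numbers directly; the logging.getLevelName function maps between the two representations. A quick sketch:

```python
import logging

# The level constants are plain integers:
print(logging.DEBUG, logging.INFO, logging.WARNING,
      logging.ERROR, logging.CRITICAL)
# → 10 20 30 40 50

# getLevelName maps a number to its name...
print(logging.getLevelName(30))  # → WARNING
# ...and an unregistered number gets a generic label:
print(logging.getLevelName(25))  # → Level 25
```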
Configuring The Basic Interface
You saw above that you can change the log level threshold by calling a
function called basicConfig:
- logging.basicConfig(level=logging.INFO)
- logging.debug("You won't see this message!")
- logging.error("But you will see this one.")
If you use it at all, basicConfig must be called exactly once, and
it must happen before the first logging event. (Meaning, before the
first call to debug(), or warning(), etc.) Additionally, if your
program has several threads, it must be called from the main thread -
and only the main thread. [31]
You already met one of the configuration options, level. This is set
to the log level threshold, and is one of DEBUG, INFO, WARNING,
ERROR, or CRITICAL. Some of the other options include:
filename - Write log messages to the given file, rather than stderr.
filemode - Set to "a" to append to the log file (the default), or "w" to overwrite.
format - The format of log records.
level - The log level threshold, described above.
By default, log messages are written to standard error. You can also
write them to a file, one per line, to easily read later. Do this by
setting filename to the log file path. By default, new log messages
are appended to the end of the file, preserving any existing
content. If you’d rather the file be emptied before the first
log message, set filemode to "w". Be careful about doing that, of course,
because you can easily lose old log messages if the application
restarts:
- # Wipes out previous log entries when program restarts
- logging.basicConfig(filename="log.txt", filemode="w")
- logging.error("oops")
The other valid value is "a", for append - that’s the default, and probably will serve you better in production. "w" can be useful during development, though.
format defines what chunks of information the final log record will
include, and how they are laid out. These chunks are called
"attributes" in the logging module docs. One of these attributes is
the actual log message - the string you pass when you call
logging.warning(), and so on. Often you will want to include other
attributes as well. Consider the kind of log record we saw above:
WARNING:root:Collision imminent
This record has three attributes, separated by colons. First is the
log level name; last is the actual string message you pass when you
call logging.warning(). (In the middle is the name of the underlying
logger object. basicConfig uses a logger called "root"; we’ll learn
more about that later.)
You specify the layout you want by setting format to a string that
defines certain named fields, according to percent-style
formatting. Three of them are levelname, the log level; message,
the message string passed to the logging function; and name, the
name of the underlying logger. Here’s an example:
- logging.basicConfig(
- format="Log level: %(levelname)s, msg: %(message)s")
- logging.warning("Collision imminent")
If you run this as a program, you get the following output:
Log level: WARNING, msg: Collision imminent
It turns out the default formatting string is
%(levelname)s:%(name)s:%(message)s
You indicate named fields in percent-formatting by %(FIELDNAME)X,
where "X" is a type code: s for string, d for integer (decimal),
and f for floating-point.
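For example, a format string can combine several attributes at once (the timestamp in the output will vary, of course):

```python
import logging

# Timestamp, level, source line number, then the message itself.
logging.basicConfig(
    format="%(asctime)s %(levelname)s [line %(lineno)d] %(message)s",
    level=logging.INFO)
logging.info("Engines at full power")
# Emits something like:
# 2024-05-01 14:03:59,123 INFO [line 7] Engines at full power
```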
Many other attributes are provided, if you want to include them. Here’s a select few from the full list:[32]
| Attribute | Format | Description |
|---|---|---|
| asctime | %(asctime)s | Human-readable date/time |
| funcName | %(funcName)s | Name of function containing the logging call |
| lineno | %(lineno)d | The line number of the logging call |
| message | %(message)s | The log message |
| pathname | %(pathname)s | Full pathname of the source file of the logging call |
| levelname | %(levelname)s | Text logging level for the message ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') |
| name | %(name)s | The logger’s name |
You might be wondering why log record format strings use Python 2’s ancient percent-formatting style, even in modern Python. As it turns out, the conversion was attempted, but backwards-compatibility reasons made percent-formatting the only practical choice for the logging module, even after the Python 3 reboot.
If you want to use the newer string formatting badly enough, there are things you can do - there’s even a standard recipe.[33] But doing so is complicated enough that it may not be worth the effort, and it won’t help with legacy code. I recommend you simply cooperate with the situation, and use percent formatting with your Python logging.
Passing Arguments
You often want to include some kind of runtime data in the logged message. When you construct the message to log, specify the final log message like this:
- num_fruits = 14
- fruit_name = "oranges"
- logging.info(
- "We ate %d of your %s. Thanks!",
- num_fruits, fruit_name)
The output:
INFO:root:We ate 14 of your oranges. Thanks!
We call info with three parameters. First is the format
string; the second and third are arguments. The general form is
logging.info(format, *args)
You can pass zero or more arguments, so long as each has a field in the format string:
- # Do it like this:
- logging.info("%s, %s, %s, %s, %s, %s and %s",
-     "Doc", "Happy", "Sneezy", "Bashful",
-     "Dopey", "Sleepy", "Grumpy")
You must resist the obvious temptation to format the string fully, and pass that to the logging function:
- num_fruits = 14
- fruit_name = "oranges"
- logging.warning(
- "Don't do this: %d %s" % (num_fruits, fruit_name))
- logging.warning(
- "Or this: {:d} {:s}".format(
- num_fruits, fruit_name))
- logging.warning(
- f"And definitely not this: {num_fruits:d} {fruit_name:s}")
All of these work, of course, in the sense that you will get correct log
messages. But doing so surrenders important performance benefits
that logging normally provides. Remember: when the line of code
with the log message is executed, it may not actually trigger a
log event. If the log level threshold is higher than the message
itself, the line does nothing. In that case, there is no
reason to format the string.
In the "do it like this" form, the string is formatted if and only if a log event actually happens, so that’s fine. But if you format the string yourself, it’s always formatted. That takes up system memory and CPU cycles even if no logging takes place. If the code path with the logging call is only executed occasionally, that’s not a big deal. But it impairs the program when a debug message is logged in the middle of a tight loop. When you originally code the line, you never really know where it might migrate in the future, or who will call your function in ways you never imagined.
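You can observe the deferred formatting directly. In this sketch, a small class counts how often it is actually converted to a string; below the threshold, the conversion never happens at all:

```python
import logging

class ExpensiveRepr:
    """Counts how many times it is actually rendered as a string."""
    conversions = 0
    def __str__(self):
        ExpensiveRepr.conversions += 1
        return "expensive detail"

logging.basicConfig(level=logging.WARNING)
detail = ExpensiveRepr()

# Below the threshold: the format string is never applied,
# so __str__ is never called.
logging.debug("Detail: %s", detail)
assert ExpensiveRepr.conversions == 0

# At or above the threshold, formatting happens exactly once.
logging.warning("Detail: %s", detail)
assert ExpensiveRepr.conversions == 1
```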
So just use the supported form, where the first argument is the format string, and subsequent arguments are the parameters for it. You can also use named fields, by passing a dictionary as the second argument - though you must use the percent-formatting named-field format:
- fruit_info = {"count": 14, "name": "oranges"}
- logging.info(
- "We ate %(count)d of your %(name)s. Thanks!",
- fruit_info)
Beyond Basic: Loggers
The basic interface is simple and easy to set up. It works well in
single-file scripts. Larger Python applications tend to have different
logging needs, however. logging meets these needs through a richer
interface, called logger objects - or simply, loggers.
Actually, you have been using a logger object all along: when you call
logging.warning() (or the other log functions), they convey messages
through what is called the root logger - the primary, default logger
object. This is why the word "root" shows in some example
output.
logging.basicConfig operates on this root logger. You can fetch the
actual root logger object by calling logging.getLogger:
- >>> logger = logging.getLogger()
- >>> logger.name
- 'root'
As you can see, it knows its name is "root". Logger objects have all
the same functions (methods, actually) the logging module itself
has:
- import logging
- logger = logging.getLogger()
- logger.debug("Small detail. Useful for troubleshooting.")
- logger.info("This is informative.")
- logger.warning("This is a warning message.")
- logger.error("Uh oh. Something went wrong.")
- logger.critical("We have a big problem!")
Save this in a file and run it, and you’ll see the following output:
This is a warning message.
Uh oh. Something went wrong.
We have a big problem!
This is different from what we saw with basicConfig, which printed
out this instead:
WARNING:root:This is a warning message.
ERROR:root:Uh oh. Something went wrong.
CRITICAL:root:We have a big problem!
At this point, we’ve taken steps backward compared to
basicConfig. Not only is the log message unadorned by the log level,
or anything else useful; the log level threshold is also hard-coded to
logging.WARNING, with no way to change it. And the logging output will
be written to standard error, and nowhere else, regardless of where
you actually need it to go.
Let’s take inventory of what we want to control here. We want to choose our log record format. And further, we want to be able to control the log level threshold, and write messages to different streams and destinations. You do this with a tool called handlers.
Log Destinations: Handlers and Streams
By default, loggers write to standard error. You can select a different destination - or even several destinations - for each log record:
- You can write log records to a file. Very common.
- You can, while writing records to that file, also parrot it to stderr.
- Or to stdout. Or both.
- You can simultaneously log messages to two different files.
- In fact, you can log (say) INFO and higher messages to one file, and ERROR and higher to another.
- You can write log records to a remote log server, accessed via a REST HTTP API.
- Mix and match all the above, and more.
- And you can set a different, custom log format for each destination.
This is all managed through what are called handlers. In Python logging, a handler’s job is to take a log record, and make sure it gets recorded in the appropriate destination. That destination can be a file; a stream like stderr or stdout; or something more abstract, like inserting into a queue, or transmitting via an RPC or HTTP call.
By default, logger objects don’t have any handlers. You can verify this
using the hasHandlers method:
- >>> logger = logging.getLogger()
- >>> logger.hasHandlers()
- False
With no handler, a logger has the following behavior:
- Messages are written to stderr.
- Only the message is written, nothing else. There’s no way to add fields or otherwise modify it.
- The log level threshold is logging.WARNING. There is no way to change that.
To change this, your first step is to create a handler. Nearly all logger objects you ever use will have custom handlers. Let’s see how to create a simple handler that writes messages to a file, called "log.txt".
- import logging
- logger = logging.getLogger()
- log_file_handler = logging.FileHandler("log.txt")
- logger.addHandler(log_file_handler)
- logger.debug("A little detail")
- logger.warning("Boo!")
The logging module provides a class called FileHandler. It takes a
file path argument, and will write log records into that file, one per
line. When you run this code, log.txt will be created (if it doesn’t
already exist), and will contain the string "Boo!" followed by a
newline. (If log.txt did exist already, the logged message would be
appended to the end of the file.)
But "A little detail" is not written, because it’s below the default
logger threshold of WARNING. We change that by calling a method
named setLevel on the logger object:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
- log_file_handler = logging.FileHandler("log.txt")
- logger.addHandler(log_file_handler)
- logger.debug("A little detail")
- logger.warning("Boo!")
This writes the following in "log.txt":
A little detail Boo!
Confusingly, you can call setLevel on a logger with no handlers,
but it has no effect:
- # Doing it wrong:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG) # No effect.
- logger.debug("This won't work :(")
To change the threshold from the default of logging.WARNING, you
must both add a handler, and change the logger’s level.
What if you want to log to stdout? Do that with a StreamHandler:
- import logging
- import sys
- logger = logging.getLogger()
- out_handler = logging.StreamHandler(sys.stdout)
- logger.addHandler(out_handler)
- logger.warning("Boo!")
If you save this in a file and run it, you’ll get "Boo!" on standard
output. Notice that logging.StreamHandler takes sys.stdout as its
argument. You can create a StreamHandler without an argument too, in
which case it will write its records to standard error:
- import logging
- logger = logging.getLogger()
- # Same as StreamHandler(sys.stderr)
- stderr_handler = logging.StreamHandler()
- logger.addHandler(stderr_handler)
- logger.warning("This goes to standard error")
In fact, you can pass any file-like object; the object just needs to
define compatible write and flush methods. Theoretically, you
could even log to a file by creating a handler like
StreamHandler(open("log.txt", "a")) - but in that case, it’s better
to use a FileHandler, so it can manage opening and closing the file.
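As a quick illustration of the file-like-object point, an io.StringIO buffer works as a destination too. A sketch, useful mainly for capturing log output in tests:

```python
import io
import logging

# Any object with write() and flush() will do; StringIO qualifies.
stream = io.StringIO()
logger = logging.getLogger("capture")
handler = logging.StreamHandler(stream)
logger.addHandler(handler)

logger.warning("Captured!")
print(repr(stream.getvalue()))  # → 'Captured!\n'
```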
When creating a handler, your needs are nearly always met by either
StreamHandler or FileHandler. There are other predefined handlers,
too, useful when logging to certain specialized destinations:
- WatchedFileHandler and RotatingFileHandler, for logging to rotated log files
- SocketHandler and DatagramHandler, for logging over network sockets
- HTTPHandler, for logging over an HTTP REST interface
- QueueHandler and QueueListener, for queuing log records across thread and process boundaries
See the official docs[34] for more details.
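As a taste of one of these, here is a sketch using RotatingFileHandler, which caps the log file’s size and rolls old content into numbered backup files (the file name and limits here are arbitrary choices):

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Keep each file under ~1 MB, retaining up to 3 rotated
# backups (app.log.1, app.log.2, app.log.3).
rotating_handler = RotatingFileHandler(
    "app.log", maxBytes=1_000_000, backupCount=3)
logger.addHandler(rotating_handler)

logger.info("This record lands in app.log.")
```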
Logging to Multiple Destinations
Suppose you want your long-running application to log all messages to a file, including debug-level records. At the same time, you want warnings, errors, and criticals logged to the console. How do you do this?
We’ve given you part of the answer already. A single logger object can
have multiple handlers: all you have to do is call addHandler
multiple times, passing a different handler object for each. For
example, here is how you parrot all log messages to the console (via
standard error) and also to a file:
- import logging
- logger = logging.getLogger()
- # Remember, StreamHandler defaults to using sys.stderr
- console_handler = logging.StreamHandler()
- logger.addHandler(console_handler)
- # Now the file handler:
- logfile_handler = logging.FileHandler("log.txt")
- logger.addHandler(logfile_handler)
- logger.warning(
- "This goes to both the console, AND log.txt.")
This is combining what we learned above. We create two handlers - a
StreamHandler named console_handler, and a FileHandler named
logfile_handler - and add both to the same logger (via
addHandler). That’s all you need to log to multiple destinations in
parallel. Sure enough, if you save the above in a script and run it,
you’ll find the messages are both written into "log.txt", as well as
printed on the console (through standard error).
We aren’t done, though. How do we make it so every record is written
in the log file, but only those of logging.WARNING or higher get
sent to the console screen? Do this by setting log level thresholds
for both the logger object and the individual handlers. Both logger
objects and handlers have a method called setLevel, taking a log
level threshold as an argument:
- my_logger.setLevel(logging.DEBUG)
- my_handler.setLevel(logging.INFO)
If you set the level for a logger, but not its handlers, the handlers inherit from the logger:
- my_logger.setLevel(logging.ERROR)
- my_logger.addHandler(my_handler)
- my_logger.error("This message is emitted by my_handler.")
- my_logger.debug("But this message will not be.")
And you can override that at the handler level. Here, I create two handlers. One handler inherits its threshold from the logger, while the other does its own thing:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
-
- verbose_handler = logging.FileHandler("verbose.txt")
- logger.addHandler(verbose_handler)
-
- terse_handler = logging.FileHandler("terse.txt")
- terse_handler.setLevel(logging.WARNING)
- logger.addHandler(terse_handler)
-
- logger.debug("This message appears in verbose.txt ONLY.")
- logger.warning("And this message appears in both files.")
There’s a caveat, though: a handler can only make itself more
selective than its logger, not less. If the logger chooses a threshold
of logging.DEBUG, its handler can choose a threshold of
logging.INFO, or logging.ERROR, and so on. But if the logger defines
a strict threshold - say, logging.INFO - an individual handler cannot
choose a lower one, like logging.DEBUG. So something like this won’t
work:
- # This doesn't quite work...
- import logging
- my_logger = logging.getLogger()
- my_logger.setLevel(logging.INFO)
- my_handler = logging.StreamHandler()
- my_handler.setLevel(logging.DEBUG) # FAIL!
- my_logger.addHandler(my_handler)
- my_logger.debug("No one will ever see this message :(")
There’s a subtle corollary of this. By default, a logger object’s
threshold is set to logging.WARNING. So if you don’t set the logger
object’s log level at all, it implicitly censors all handlers:
- import logging
- my_logger = logging.getLogger()
- my_handler = logging.StreamHandler()
- my_handler.setLevel(logging.DEBUG) # FAIL!
- my_logger.addHandler(my_handler)
- # No one will see this message either.
- my_logger.debug(
- "Because anything under WARNING gets filtered.")
The logger object’s default log level is not always permissive enough for all handlers you might want to define. So you will generally want to start by setting the logger object to the lowest threshold needed by any log-record destination, and tighten that threshold for each handler as needed.
Bringing this all together, we can now accomplish what we originally wanted - to verbosely log everything into a log file, while duplicating only the more interesting messages onto the console:
- import logging
- logger = logging.getLogger()
- logger.setLevel(logging.DEBUG)
- # Warnings and higher only on the console.
- console_handler = logging.StreamHandler()
- console_handler.setLevel(logging.WARNING)
- logger.addHandler(console_handler)
- # But allow everything to go into the log file.
- logfile_handler = logging.FileHandler("log.txt")
- logger.addHandler(logfile_handler)
-
- logger.warning(
- "This goes to both the console, AND into log.txt.")
- logger.debug("While this only goes to the file.")
Add as many handlers as you want. Each can have different log
levels. You can log to many different destinations, using the
different built-in handler types mentioned above. If those don’t do
what you need, implement your own subclass of logging.Handler and
use that.
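A minimal custom handler only needs to override emit. For instance, this hypothetical ListHandler just collects formatted records in memory:

```python
import logging

class ListHandler(logging.Handler):
    """A toy handler that gathers formatted records in a list."""
    def __init__(self):
        super().__init__()
        self.records = []
    def emit(self, record):
        # self.format applies this handler's formatter (or a
        # default one), producing the final string.
        self.records.append(self.format(record))

logger = logging.getLogger("demo")
handler = ListHandler()
logger.addHandler(handler)
logger.warning("Captured in memory")
print(handler.records)  # → ['Captured in memory']
```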
Record Layout with Formatters
We haven’t covered one important detail. So far, we’ve only shown you how to create logger objects that will write just the log message and nothing else. At the very least, you probably want to annotate that with the log level. You may also want to insert the time, or some other information. How do you do that?
The answer is to use a formatter. A formatter converts a log record into something that is recorded in the handler’s destination. That’s an abstract way of saying it; more simply, a typical formatter just converts the record into a usefully-formatted string. That string contains the actual log message, as well as the other fields you care about.
The procedure is to create a Formatter object, then associate it
with a handler (using the latter’s setFormatter method). Creating a
formatter is easy - it normally takes just one argument, the format
string:
- import logging
- my_handler = logging.StreamHandler()
- fmt = logging.Formatter("My message is: %(message)s")
- my_handler.setFormatter(fmt)
- my_logger = logging.getLogger()
- my_logger.addHandler(my_handler)
- my_logger.warning("WAKE UP!!")
If you run this in a script, the output will be:
My message is: WAKE UP!!
Notice the attribute for the message, %(message)s, included in the
string. This is just a normal formatting string, in the older,
percent-formatting style. It’s exactly equivalent to using the
format argument when you call basicConfig. For this reason, you
can use the same attributes, arranged however you like - here’s the
attribute table again, distilled from the
full official list:[35]
| Attribute | Format | Description |
|---|---|---|
| asctime | %(asctime)s | Human-readable date/time |
| funcName | %(funcName)s | Name of function containing the logging call |
| lineno | %(lineno)d | The line number of the logging call |
| message | %(message)s | The log message |
| pathname | %(pathname)s | Full pathname of the source file of the logging call |
| levelname | %(levelname)s | Text logging level for the message ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') |
| name | %(name)s | The logger’s name |
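Pulling several attributes together: Formatter also accepts an optional second argument, datefmt, which controls how %(asctime)s is rendered. A sketch:

```python
import logging

handler = logging.StreamHandler()
fmt = logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S")
handler.setFormatter(fmt)

logger = logging.getLogger()
logger.addHandler(handler)
logger.warning("Shields at 60 percent")
# Emits something like:
# 2024-05-01 14:03:59 WARNING root: Shields at 60 percent
```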
[30] These beautifully crisp descriptions, which I cannot improve upon, are taken from https://docs.python.org/3/howto/logging.html .
[31] These restrictions aren’t in place for logger objects, described later.