As part of a workshop I'm preparing for, I decided to do a deep dive into Python's built-in logging module from the Python Standard Library. I always had just used the module without thinking too heavily about how it was put together or how it worked.
import logging and then away I went. Well, that's about to change, and I thought you might enjoy coming along for the ride. This post will go down into some of the module, especially the bits and thoughts that aren't detailed in the library reference, but I'm sure I'll miss spots along the way. We'll come back to it eventually!
The Python standard logging module makes the basics of logging easy for a developer. Import the module, then use the built-in error levels to send messages to standard out or standard error (
stderr). The error levels are
critical. Pretty simple, pretty obvious&emdash;just the reason that I love Python. As you might expect from a logging system, the debug and info levels are suppressed by default. Python also allows you to configure the module to send those lower levels through direct changes in the Python interpreter if you're calling your script in the interpreter, through a command-line argument, or through a config file you've generated yourself. Python, like many other languages, doesn't tell you what to use each logging level for. It doesn't decide that for you. Knowing what to send to where at which level is an art in and of itself as many an ops or devops person would tell you after they've seen overly verbose logs and nearly non-existent logs. Perhaps I'll dive into the different logging levels in general in a different post.
The module is really made of four main classes: loggers, handlers, filters, and formatters. Think of handlers as the orchestration layer, with loggers doing the actual interaction, filters doing the job of deciding what to send to the handlers, and formatters generating the strings to send on to the handlers. There are as many handlers as there are destinations for your logs, from
stderr to files to cloud logging providers. Logger objects can be created in a hierarchy of loggers, meaning you could create a logger for a parent class, for example, then have loggers for each child class that inherits from that parent logger and propagates logs back up the chain through the handlers. That sets you up to use filters and send logs to different places based on the handlers. Filters can be attached to loggers or handlers, meaning they are explicitly added to one or the other through a method call on the logger or handler. One interesting attribute of filters is that you can use them to add context details to a log line, which is interesting to me due to the structured logging formats that we'll see in a minute. Finally, formatters take in
LogRecord objects and generate the final message structure you want to send to a handler. While these messages are often strings, you can send dictionaries or objects, too.
Some other, less known bits from the logging module include
LogRecord class is the actual object that holds all of the pertinent log data. The class comes with a number of parameters that are already preconfigured that the standard logging module recognizes. However, you can also modify the base log record factory or create a new one to use in addition to the base one to create more attributes for the log record to hold. That's really important for structured logging because good structured logs tell the system exactly what it needs to know without being too large of an object to parse. So rather than having extraneous data that you might not need in one app versus another, you can target the data that you actually want that will be helpful. Reducing noise while providing actionable information is always helpful. The
LoggerAdapter, on the other hand, can help you add context to your logs, which means you can send stuff like network data or other details that your logger might not necessarily collect on its own. I think the basic idea is that you would use an adapter rather than a filter in cases where you might need to send data once with a specific log line in a specific section of your code so you don't have to write a whole new filter just for that tiny use case.
Now, the workshop I'm working on is about structured logging. I can certainly get into why we should use structured logs, but I'd rather stick to the deep dive on Python specifically here. In reality, yes, you can implement structured logs yourself in Python, and the Python maintainers have added the how-to to their documentation as a cookbook. There are also libraries now that can help you implement structured logs. An example is structlog, a Python library. As always with Python, loggers, handlers, filters, formatters, log records, and adapters are all objects. That means they all have attributes and methods. You can modify any object in the pipeline to match your specific need, including to add structured log formats. I've been playing with structlog, and it seems like a simple plug-and-play setup if you don't feel like coding the necessary bits yourself. However, I'm of the camp where you build it yourself if it's simple, mainly because of the leftpad controversy from the NPM/Node.JS world a while ago. It's really up to you to see just how complicated of a system you need for your logs.
Interested in learning what I've found in general about logging more useful data and give it a shot yourself? You can join me at DeveloperWeek Austin 2019 for my workshop if you're in the area, or you can get a sneak preview if you're early or just enjoy the content on the workshop repo. I'll likely be working on it right up until the end of October 2019, and then probably tweaking it based on feedback, so enjoy! Feel free to also join the conversation on my Twitter account.