Chicoscience
Chicoscience

Reputation: 1075

Python logging: Only write formatter content (date, level) if a new line is starting

I am using the logging package in Python.

When creating a handler, I use:

handler.terminator = ""

... so that by default the line does not end when calling the info or debug function. I use it to log things like this:

Writing applications in... 1.29s
Writing assets in... 2.34s

In the above the computational time is written at a second log call. The formatter is empty. I now want to add a formatter, and naturally I get this:

20220206 22:20:02 [INFO] Writing applications in... 20220206 22:20:03 [INFO] 1.29s
20220206 22:20:03 [INFO] Writing assets in... 20220206 22:20:05 [INFO] 2.34s

Is it possible to ensure that the formatter is only applied to when a new line is beginning? Like this, ideally:

20220206 22:20:02 [INFO] Writing applications in... 1.29s
20220206 22:20:03 [INFO] Writing assets in... 2.34s

Thank you very much.

Upvotes: 1

Views: 620

Answers (1)

Lenormju
Lenormju

Reputation: 4378

Minimal Reproducible Example with no formatter :

import logging
import random
import sys
random.seed(71012594)

logger = logging.getLogger("so71012594")
handler = logging.StreamHandler()
handler.terminator = ""
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

logger.debug("Doing something ... ")
duration = random.random() * 10  # a float between 0 and 10
logger.debug(f"in {duration} seconds\n")

logger.debug("Doing something else ... ")
duration = random.random() * 10
logger.debug(f"in {duration} seconds\n")
Doing something ... in 8.44033947716514 seconds
Doing something else ... in 9.921684947596866 seconds

Adding the formatter :

# I chose to keep a short format string for readability
formatter = logging.Formatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)

logger.debug("Doing something ... ")
# ...
[DEBUG] Doing something ... [DEBUG] in 8.44033947716514 seconds
[DEBUG] Doing something else ... [DEBUG] in 9.921684947596866 seconds

Looking at the CPython implementation for logging.StreamHandler in 3.10 there are only 2 occurences of terminator :

class StreamHandler(Handler):
    # ...
    terminator = '\n'

and

    def emit(self, record):
        """
        ...
        """
        try:
            msg = self.format(record)
            stream = self.stream
            # issue 35046: merged two stream.writes into one.
            stream.write(msg + self.terminator)
            self.flush()
        except RecursionError:  # See issue 36272
            raise
        except Exception:
            self.handleError(record)

So the terminator is always appended to the log line, which is the formatting applied to the message.

I see different options :

  1. don't call format at all (conditional call to format by emit)
  2. change what format does (conditional formatting)

Solution 1

There are different ways to implement it, let's do something quick and dirty : monkey-patching emit !

original_stream_handler_emit = handler.emit  # getting a reference to the bound method
def my_emit(record):
    print(repr(record))
    return original_stream_handler_emit(record)  # call the bound method like a function
handler.emit = my_emit  # monkey-patching

logger.debug("Doing something ... ")
# ...
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 21, "Doing something ... ">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 23, "in 8.44033947716514 seconds
">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 25, "Doing something else ... ">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 27, "in 9.921684947596866 seconds
">
[DEBUG] Doing something ... [DEBUG] in 8.44033947716514 seconds
[DEBUG] Doing something else ... [DEBUG] in 9.921684947596866 seconds

Now, we just have to detect whether there is a newline in the message :

original_stream_handler_emit = handler.emit  # getting a reference to the bound method
def my_emit(record):
    self = handler
    if "\n" in record.msg:
        # it is the second part of a message, we don't want any pesky formatting
        try:
            self.stream.write(record.msg)  # no formatting applied on the record !
        except Exception:
            self.handleError(record)
    else:
        # simply call the original
        original_stream_handler_emit(record)  # call the bound method like a function
handler.emit = my_emit  # monkey-patching

logger.debug("Doing something ... ")
# ...
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds

Monkey patching is simple to write but not very maintainable.

Another way is to create a StreamHandler subclass :

class MyStreamHandler(logging.StreamHandler):
    def emit(self, record):
        if "\n" in record.msg:
            # it is the second part of a message, we don't want any pesky formatting
            try:
                self.stream.write(record.msg)  # no formatting applied on the record !
            except Exception:
                self.handleError(record)
        else:
            # simply call the original
            super().emit(record)  # logging.StreamHandler.emit

[...]
handler = MyStreamHandler()  # instead of logging.StreamHandler

Which is much better in my opinion.

Solution 2

There already exists similar question on this site, for example "Can Python's logging format be modified depending on the message log level?".

Here is how to do it :

class MyFormatter(logging.Formatter):
    def format(self, record):
        if "\n" in record.msg:
            # it is the second part of a message, we don't want any pesky formatting
            return record.msg
        else:
            # simply call the original
            return super().format(record)  # logging.Formatter.format

formatter = MyFormatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)

Limitations for the solutions 1 and 2

Relying on record.msg is wonky :

logger.debug("Hello\nWorld!")
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds
Hello
World!

There is no formatting at all for the record because it contains a newline. While it is true that most log messages should be on one line, that's not always true.

Also :

logger.propagate = False  # for demonstration purposes
try:
    1/0
except ZeroDivisionError:
    logging.exception("math error\nThese pesky zeroes !")  # this one using the default logger/formatter
    print("\n---\n", file=sys.stderr)
    logger.exception("math error\nThese pesky zeroes !")  # this one using the custom formatter
ERROR:root:math error
These pesky zeroes !
Traceback (most recent call last):
  File "/home/S0121595/workspace/stack_overflow/so71012594.py", line 65, in <module>
    1/0
ZeroDivisionError: division by zero

---

math error
These pesky zeroes !

There is no stacktrace displayed nor indication there was an error, because all that normally would get added by the regular format function, they are not included in the record.msg field.

It also breaks when *args are supplied :

logger.propagate = False ; logging.getLogger().setLevel(logging.DEBUG)  # for demonstration purposes
logging.debug("part one,\n %s", "part two")  # this one using the default logger/formatter
print("\n---\n", file=sys.stderr)
logger.debug("part one,\n %s", "part two")  # this one using the custom formatter
DEBUG:root:part one,
 part two

---

part one,
 %s

So either you can ignore all these cases, or you should implement them in your solution. Here is an untested example based on CPython's logging.Fomatter.format :

    def format(self, record):
        """
        ...
        """
        record.message = record.getMessage()
        if self.usesTime():
            record.asctime = self.formatTime(record, self.datefmt)
        s = self.formatMessage(record)
        s = record.message
        if record.exc_info:
            # Cache the traceback text to avoid converting it multiple times
            # (it's constant anyway)
            if not record.exc_text:
                record.exc_text = self.formatException(record.exc_info)
        if record.exc_text:
            if s[-1:] != "\n":
                s = s + "\n"
            s = s + record.exc_text
        if record.stack_info:
            if s[-1:] != "\n":
                s = s + "\n"
            s = s + self.formatStack(record.stack_info)
        return s

(the diff syntax is not supported on StackOverflow)

For all these reasons, I think it is best to not rely on detecting \n.

Solution 3

Basically what you want is :

# first message :    no newline here
                           v
[DEBUG] Doing something ... in 8.44033947716514 seconds
                           ^
# second message :   no formatting here

So I propose a clearer way to indicate that :

#                                    vvvvvvvvvvvvvvvvv
logger.debug("Doing something ... ", extra={"end": ""})
logger.debug(f"in {duration} seconds", extra={"no_format": True})
#                                      ^^^^^^^^^^^^^^^^^^^^^^^^^

I'm reusing the end= convention of print with the extra parameter of all logging functions.

The Logger.makeRecord method takes care of converting the extra pairs into record fields :

original_stream_handler_emit = handler.emit  # getting a reference to the bound method
def my_emit(record):
    print("end:", repr(getattr(record, "end", None)))
    print("no_format: ", getattr(record, "no_format", None))
    return original_stream_handler_emit(record)  # call the bound method like a function
handler.emit = my_emit  # monkey-patching
end: ''
no_format:  None
end: None
no_format:  True

So here is the full code :

import contextlib  # stdlib
import logging
import random

random.seed(71012594)

@contextlib.contextmanager
def patch_object(obj, field_name, temp_value):  # basically `mock.patch.object`
    original_value = getattr(obj, field_name)  # make a backup
    setattr(obj, field_name, temp_value)  # set the new value
    try:
        yield temp_value
    finally:
        setattr(obj, field_name, original_value)  # restore backup in any case

class MyStreamHandler2(logging.StreamHandler):
    def emit(self, record):
        end_value = getattr(record, "end", None)
        use_end_value = end_value is not None and end_value != self.terminator

        no_format = getattr(record, "no_format", False)
        fake_format_function = (lambda rec: rec.message)

        with patch_object(self, "terminator", end_value) if use_end_value else contextlib.nullcontext(), \
                patch_object(self.formatter, "formatMessage", fake_format_function) if no_format else contextlib.nullcontext():
            super().emit(record)


logger = logging.getLogger("so71012594")
handler = MyStreamHandler2()  # instead of logging.StreamHandler
#handler.terminator = ""  # No need to replace the terminator !
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)


logger.debug("Doing something ... ", extra={"end": ""})
duration = random.random() * 10  # a float between 0 and 10
logger.debug(f"in {duration} seconds", extra={"no_format": True})

logger.debug("Doing something else ... ", extra={"end": ""})
duration = random.random() * 10
logger.debug(f"in {duration} seconds", extra={"no_format": True})
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds

It can be strange to include unittest (specially its mock subpackage) in the non-test code, but I find it better for monkey-patching than reimplementing it each time, or taking risks by simply setting/resetting values. It is simply that :

from unittest.mock import patch  # stdlib

# remove the `patch_object` function entirely

[...]

with patch.object(...
        patch.object(...

And it works like a charm.

Note : the end value can be anything !

logger.debug("Doing something ... ", extra={"end": "πŸ‘βœ¨\tπŸš’βž‘ ️"})
[DEBUG] Doing something ... πŸ‘βœ¨ πŸš’βž‘ ️in 8.44033947716514 seconds

Bonus : you can now have as many parts as you want to form a single line !

import time
for _ in range(4):
    duration = random.random() * 10
    logger.debug("Starting to do something long ", extra={"end": ""})
    for elapsed in range(int(duration)+1):
        logger.debug(".", extra={"end": "", "no_format": True})
        time.sleep(1)
    logger.debug(f" finished ({duration:.1f}s) !", extra={"no_format": True})
[DEBUG] Starting to do something long ......... finished (8.4s) !
[DEBUG] Starting to do something long .......... finished (9.9s) !
[DEBUG] Starting to do something long .. finished (1.4s) !
[DEBUG] Starting to do something long ......... finished (8.6s) !

(the dots appear progressively)

Upvotes: 1

Related Questions