Remotec
Remotec

Reputation: 10760

Does python logging incur a performance hit if you log below the set level?

I'm looking at optimising some code.

For example if I have set Python's log level to info and I write code e.g.

logger.debug(....)

Does Python know that I am set to info level and effectively throw the debug statement away?

Is there a way to determine which log level is set and I could test this before doing more costly operations such as formatting strings for logging? Is it better to do this or just log and let Python determine what should be logged.

Which approach is better for performance?

Upvotes: 6

Views: 3283

Answers (4)

Markus Dutschke
Markus Dutschke

Reputation: 10606

use if logger.level <= to avoid expensive calculations

The problem is sometimes / often, that the expensive work is done, even before the logger is called. This means, if you have a function call logger.debug(fct_expensive()), fct_expensive is evaluated first and then the result is passed to logger.debug.

demo code

import logging
import time
import timeit

def fct_expensive():
    time.sleep(1)
    return "expensive result"

def approach_0_not_optimized(logger):
    logger.debug(fct_expensive())

def approach_1_if_loglevel(logger):
    if logger.level <= logging.DEBUG:
        logger.debug(fct_expensive())

if __name__ == "__main__":
    log_level_init = logging.INFO  # 20

    # set up logger: https://docs.python.org/3/howto/logging.html
    logger = logging.getLogger()
    logger.setLevel(log_level_init)
    ch = logging.StreamHandler()
    formatter = logging.Formatter('%(asctime)s - %(funcName)s - %(levelname)s - %(message)s')
    ch.setFormatter(formatter)
    logger.addHandler(ch)

    print(f"\n\n{logger.level=}")
    print(f"{timeit.timeit(lambda: approach_0_not_optimized(logger), number=1)=}")
    print(f"{timeit.timeit(lambda: approach_1_if_loglevel(logger), number=1)=}")

    logger.setLevel(logging.DEBUG)  # 10
    print(f"\n\n{logger.level=}")
    print(f"{timeit.timeit(lambda: approach_0_not_optimized(logger), number=1)=}")
    print(f"{timeit.timeit(lambda: approach_1_if_loglevel(logger), number=1)=}")

output

logger.level=20
timeit.timeit(lambda: approach_0_not_optimized(logger), number=1)=1.0010744459996204
timeit.timeit(lambda: approach_1_if_loglevel(logger), number=1)=2.0000006770715117e-06


logger.level=10
timeit.timeit(lambda: approach_0_not_optimized(logger), number=1)=1.001220555001055
2022-11-09 11:34:32,535 - approach_0_not_optimized - DEBUG - expensive result
2022-11-09 11:34:33,536 - approach_1_if_loglevel - DEBUG - expensive result
timeit.timeit(lambda: approach_1_if_loglevel(logger), number=1)=1.001176765999844

Please note the performance difference at logger.level==20 (logging.INFO): approach_1_if_loglevel is skipping the time.sleep(1) second expensive work part and hence takes only microseconds to execute.

Upvotes: 0

freakish
freakish

Reputation: 56477

Does Python know that I am set to info level and effectively throw the debug statement away?

First of all no Python statement ever disappears. Python is not like C that will just remove unused code. For Python there is no such thing as unused code (unfortunately). The call will always happen, formatting will always happen. So the only question is: is that really a performance issue and what logic happens underneath the call?

First let me address the .debug() call itself. Python doesn't know much on its own. It totally depends on logger implementation. By default Python loggers won't do anything when they are not enabled for a given level. However this behaviour can be overridden. And this is not some abstract, academic situation, third-party libraries do provide their own implementations all the time. On the other hand such behaviour (i.e. do something even though not allowed because of level) would be a very nasty and unfriendly one. And you can assume (as most of us do) that no such thing happens unless you have a reason to doubt it.

Which approach is better for performance?

Now for the string formatting. If you worry about it then IMO you are wasting time. This is a micro optimization and if you need this level of performance then most likely you should not be using Python to begin with.

The only reason logging could be problematic is if it is logging over network or possibly file system.

Either way: measure first, don't speculate.

That being said, if you really are worried about the performance then you can always utilize what these docs talk about:

if logger.isEnabledFor(logging.DEBUG):
    logger.debug('Message with %s, %s', expensive_func1(),
                                        expensive_func2())

But please do read docs and familiarize with potential issues with .isEnabledFor() call.

Upvotes: 5

Lucas
Lucas

Reputation: 715

You can save yourself the hassle of formatting the string for logging as the logging module offers you some string formatting on its own.

You should also save yourself the hassle of checking the level manually. That is just what the library is for.

The docs have this nice graph to explain when and how expensive work is done.

In order to use the internal string formatting of the logging module use percent formatting without the % operator but with seperate arguments to logging.debug and freinds:

logging.debug("Check %s and %s out.", "this", "that")
logging.error("The %(foo)s was %(bar)d", dict(foo="answer", bar=43))

Upvotes: 2

MyNameIsCaleb
MyNameIsCaleb

Reputation: 4489

If you want to check current level for yourself, you can get the level by using either getEffectiveLevel() docs or isEnabledFor() docs (the latter is probably what you want to use in the case of checking over and over, though probably you should cache the result for best case performance).

Python internally will check in a similar fashion at a very early stage of the logging process and will stop the statement from being logged if needed.

Depending on how intense creating the logging statement is for you (i.e. more than string formatting) then it's probably most performant to let Python internally choose to log or not versus manually checking before creating a simple string. If you were going to do some more intense thing that required many operations, then maybe checking first from a cache would be best.

Upvotes: 0

Related Questions