Reputation: 27107
I have a third party python console script, which source I don't want to modify.
But I want to configure the logging which is done by the script and its libraries. The script uses the standard python logging, but does not support configuration of it.
The script uses this pattern:
import logging
logger=logging.getLogger(__name__)
Use cases:
How can I configure the logging, if I don't want to modify the sources of the console script?
The script gets called via cron
.
How can I configure the logging if this script?
Creating a wrapper script like in this answer is not a solution for me.
The linux process hierarchy looks like this:
Cron -> third_party_script
There should be any "glue", "wrapping" or "dirty-hack" script between cron and third_party_script
.
I want to practice "separation of concerns". I want to be able to configure logging one time and in one place. This configuration should get used by all python code of a virtualenv. Writing a wrapper would be a work-around. I want a solution.
Several months later I think a pth file would an simple answer.
Upvotes: 22
Views: 8695
Reputation: 2057
In a nutshell, what we want to do is to inject code that gets executed by the python interpreter before our main code gets executed.
The best way to achieve this is to create a virtualenv and add
sitecustomize.py
in the virtualenv's site-packages.
Let's assume that the application we want to run is called my_app.py
and that its
logger has the same name.
$ cat my_app.py
import logging
logger = logging.getLogger("my_app")
logger.debug("A debug message")
logger.info("An info message")
logger.warning("A warning message")
logger.error("An error message")
Running my_app.py
should only show the messages whose severity is > WARNING
(which
is the default behavior in python's logging).
$ python my_app.py
A warning message
An error message
Now let's create a virtual environment
python3 -m venv my_venv
And let's add sitecustomize.py
to the virtualenv's site packages.
$ cat my_venv/lib/python3.7/site-packages/sitecustomize.py
import logging
# Setup logging for my_app
# We will only setup a console handler
logger = logging.getLogger("my_app")
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setFormatter(
logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
)
logger.addHandler(ch)
Now let's try to run my_app.py
using the virtualenv:
$ ./my_venv/bin/python my_app.py
2019-01-25 16:03:16,815 - my_app - DEBUG - A debug message
2019-01-25 16:03:16,815 - my_app - INFO - An info message
2019-01-25 16:03:16,815 - my_app - WARNING - A warning message
2019-01-25 16:03:16,815 - my_app - ERROR - An error message
And that was all)
We got proper logging without having to modify my_app.py
or write a wrapper!
Now if you want to know why this is the optimal approach, keep on reading.
Before understanding why using a virtualenv + sitecustomize.py
is the correct approach
to this problem, we need to make a not so short introduction.
Note: I am going to assume that you create virtual environments using the
venv
module which uses
stdlib's site.py
. The
virtuaelnv
library, uses its own
site.py
and might be doing things slightly differently. Nevertheless, after reading this answer
you should be able to examine if there are any differences between venv
and
vitualenv
and understand how to handle them.
site-packages
The short answer is that site-packages
is the place where python installs 3rd party
code (as in non-stdlib code). For more info read
this
and the provided links.
Python does allow you to customize the python interpreter while it starts up, i.e. before our main code/script/whatever gets executed. This can be useful for e.g.:
The way you achieve the injection is by creating/modyfing either sitecustomize.py
or
usercustomize.py
. You can also use a "path configuration file" (i.e. *.pth
) file
with an import statement but I will not cover this case here since:
sitecustomize
/usercustomize
.Anyway, if you need more info WRT to path configuration files you can check PyMOTW and if you want an example of using them with import statement check this blog post.
sitecustomize
& usercustomize
So, sitecustomize
and usercustomize
are special files that don't exist by default,
but if we create them python will automatically import them before it starts to execute
our code. We can create these files:
/usr/lib/python3.7/site-packages/
)~/.local/lib/python3.7/site-packages/
)sitecustomize
is always imported before usercustomize
. If either file is missing,
the ImportError
is silently ignored.
As a security precaution, if there is a mismatch between user or group id and the effective id, then user site-packages are disabled (source). Moreover, the python interpreter has CLI arguments that either completely disable site-packages (both system and user ones) or disable user site-packages. Assuming that we don't have an id mismatch and that we don't use any CLI flags, then user site-packages have higher priority than system site-packages. So if we have both:
~/.local/lib/python3.7/site-packages/sitecustomize.py
/usr/lib/python3.7/site-packages/sitecustomize.py
the first one is the one that will be imported. We can actually check the sys.path
priority by executing the site.py
module:
$ python3 -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/home/username/.local/lib/python3.7/site-packages', # user site-packages
'/usr/lib/python3.7/site-packages', # system site-packages
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: True
The important information here is the value of ENABLE_USER_SITE
. If it is True
then
user site-packages are enabled. If it False
then we can only use global site-packages.
E.g. if we use python -s
:
$ python3 -s -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/usr/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: False
Note that in this case ENABLE_USER_SITE
is False
.
Just for completeness, let's completely disable the site-packages:
$ python3 -S -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: None
In order to better understand this, let's do an experiment. First let's create
usercustomize
sitecustomize
modules in both system and user site-packages.
WARNING: We will be creating files in system site-packages. This will be interfering with your python distribution. Be careful and REMEMBER to remove them when we are done.
# system site packages
echo 'print(f"-> {__file__}")' | sudo tee /usr/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | sudo tee /usr/lib/python3.7/site-packages/sitecustomize.py
# user site packages
echo 'print(f"-> {__file__}")' | tee ~/.local/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ~/.local/lib/python3.7/site-packages/sitecustomize.py
Let's also create a python module:
echo 'print("Inside foo")' | tee foo.py
Now let's execute foo.py
:
$ python3 foo.py
-> /home/username/.local/lib/python3.7/site-packages/sitecustomize.py
-> /home/username/.local/lib/python3.7/site-packages/usercustomize.py
Inside foo
As we can see:
sitecustomize
and usercustomize
are being importedNow what will happen if we disable user site-packages?
$ python3 -s foo.py
-> /usr/lib/python3.7/site-packages/sitecustomize.py
Inside foo
This time we see that we:
sitecustomize
is being imported. Even though usercustomize
exists in
system site-packages python does not import it! This is important! Keep it mind for when
we discuss virtualenvs! (TIP: this has to do with ENABLE_USER_SITE
; do you remember
what value does it have in this case?)sitecustomize
is being imported from system site-packagesFinally, if we completely disable site-packages, obviously usercustomize
and
sitecustomize
will be ignored:
$ python3 -S foo.py
Inside foo
OK, now let's bring virtualenv's into the game too. There are two types of virtualenvs:
--system-site-packages
.Let's create virtualenvs of both types
python3 -mvenv venv_no_system
python3 -mvenv --system-site-packages venv_system
Let's also create sitecustomize.py
and usercustomize.py
modules into the
virtualenv's site-packages:
echo 'print(f"-> {__file__}")' | tee ./venv_no_system/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_no_system/lib/python3.7/site-packages/sitecustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_system/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_system/lib/python3.7/site-packages/sitecustomize.py
and let's see the differences:
$ ./venv_no_system/bin/python -msite
/tmp/test/venv_no_system/lib/python3.7/site-packages/sitecustomize.py
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/tmp/test/venv_no_system/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: False
What do we see here? That on normal virtualenv's ENABLE_USER_SITE
is False
,
which means that:
sitecustomize
is being imported! I.e. we can't use usercustomize
to
inject code!!!We can also see that instead of our global site-packages, the virtualenv is using its
own one (i.e. /tmp/test/venv_no_system/lib/python3.7/site-packages
).
Now let's repeat this but this time with the virtualenv that uses the system site-packages:
$ ./venv_system/bin/python -msite
-> /home/username/.local/lib/python3.7/site-packages/sitecustomize.py
-> /home/username/.local/lib/python3.7/site-packages/usercustomize.py
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/tmp/test/venv_system/lib/python3.7/site-packages',
'/home/username/.local/lib/python3.7/site-packages',
'/usr/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: True
In this case, the behavior is different...
ENABLE_USER_SITE
is True
which means:
usercustomize
is being imported normally.But there is also one more difference. In this case we have 3 site-packages directories. The virtualenv one is the one with the higher priority followed by the user site-packages while the system site-packages is the last one.
I think that there are three options:
I think that in the majority of the cases, e.g. on normal servers/desktops, modifying the system python installation should generally be avoided. At least on *nix, too many things depend on Python. I would be really reluctant to change its behavior. Possible exceptions are ephemeral or static "systems" (e.g. inside a container).
As far as virtual environments go, unless we know we are going to need system site-packages, I think it makes sense to stick with the current practise and use normal ones. If we adhere to this, then in order to inject code before our script gets executed we only have one option:
To add a
sitecustomize.py
in the virtuaelenv's site packages.
# remove the virtualenvs
rm -rf my_venv
rm -rf venv_system
rm -rf venv_no_system
# remove our usercustomize.py and sitecustomize.py
sudo rm /usr/lib/python3.7/site-packages/sitecustomize.py
sudo rm /usr/lib/python3.7/site-packages/usercustomize.py
rm ~/.local/lib/python3.7/site-packages/sitecustomize.py
rm ~/.local/lib/python3.7/site-packages/usercustomize.py
# remove the modules
rm foo.py
rm my_app.py
Upvotes: 5
Reputation: 4989
Several possibilities:
Wrapper
If you can edit your cron table you could create a small script in python that get the lib logger, remove the existing log handler and hook your custom handler on it:
# Assumes the lib defines a logger object
from third_party_lib import *
# Note this assumes only one handler was defined by the lib
logger.removeHandler(logger.handlers[0])
# Then we can hook our custom format handler
custom_handler = logging.StreamHandler(sys.stdout)
custom_handler.setFormatter(logging.Formatter(format = '%(asctime)s %(levelname)s %(name)s %(process)d: %(message)s', None))
logger.addHandler(custom_handler)
logger.setLevel(logging.WARNING)
Also keep in mind this suppose the lib does not re-declare the logger on the way.
Dynamic code edit
If you do not have the possibility to modify the cron call, you might be able to do dynamic code edit but that is equivalent to editing the file by hand (hacky):
Upvotes: 5
Reputation: 27107
I asked this question several months ago. Unfortunately I got no answer which satisfied me.
The distinction between using logging and setting it up is important for me.
This is my solution: In our context we set up logging in a method which gets called in usercustomize.py
.
This way the optional plugins can use the logging without the need to set it up.
This almost solved all my needs.
Up to now I found no better way than usercustomize.py
. My perfect solution would be something I would call virtualenvcustomize.py
: Some initialization code which gets run if the interpreter loads virtualenv. Up to now I could not find such a hook. Please let me know if you have a solution.
Upvotes: 7
Reputation: 99490
A library isn't supposed to configure logging - that's up to the application developer. Inbar Rose's answer isn't quite right. If the module you're referring to is called foo
, then the reference to __name__
in its getLogger
call will be passing in foo
. So in your configuration code, you would need to do the equivalent of
logging.getLogger('foo').setLevel(logging.WARNING)
To include the PID in the logs, just ensure that you use an appropriate format string for your Formatters, i.e. one which includes %(process)d
. A simple example would be:
logging.basicConfig(format='%(process)d %(message)s')
Note that you can't write to the same log file from multiple processes concurrently - you may need to consider an alternative approach if you want to do this.
Update: An application developer is someone who writes Python code which is not the library, but is invoked by e.g. a user or another script via a command line or other means of creating a Python process.
To use the code I posted above, there is no need to wrap or modify the third-party code, as long as it's a library. For example, in the main script which invokes the third-party library:
if __name__ == '__main__':
# configure logging here
# sets the third party's logger to do WARNING or greater
# replace 'foo' with whatever the top-level package name your
# third party package uses
logging.getLogger('foo').setLevel(logging.WARNING)
# set any other loggers to use INFO or greater,
# unless otherwise configured explicitly
logging.basicConfig(level=logging.INFO, format='%(process)d %(message)s')
# now call the main function (or else inline code here)
main()
If the third party code runs via cron, it's not library code - it's an application, and you are probably out of luck.
Upvotes: 8
Reputation: 43497
You can change the minimum log level of that logger.
logging.getLogger(__name__).setLevel(logging.WARNING)
Now, only WARNING and above will be shown. No INFO and no DEBUG.
Addionally, you could also change the format. %(process)d
being the PID.
log_format = logging.Formatter('%(asctime)s %(levelname)s %(name)s %(process)d: %(message)s', '%H:%M:%S')
logging.getLogger(__name__).setFormatter(log_format)
All together:
log_format = logging.Formatter('%(asctime)s %(levelname)s %(name)s %(process)d: %(message)s', '%H:%M:%S')
log_handle = logging.getLogger(__name__)
log_handle.setLevel(logging.WARNING)
log_handle.setFormatter(log_format)
Note: You should replace __name__
in your code with the relevant log handler.
Upvotes: 2