Reputation: 14977
I would usually create a global logger
in my Python script and then write my own messages to the log in all my functions. I know Airflow has its own logging, but I find it too verbose to dig through. Is there a way to create my own logger for all tasks, so that the log only contains my custom log messages?
Upvotes: 2
Views: 6333
Reputation: 20047
Airflow uses completely standard Python logging functionality, which has everything you need. I recommend not reinventing the wheel; use what is already there.
See https://docs.python.org/3/library/logging.html
The way to approach this is to create your own named logger, log your messages to it, and configure that logger to do whatever needs to be done with them (print to a file, send them to CloudWatch or Stackdriver, etc.).
Typically, loggers in Python (this is not an "Airflow" approach - it is the standard approach that all modern Python programs follow) are created with names that follow the Python package structure.
This is what you will typically see in Python modules:
import logging
logger = logging.getLogger(__name__)
Loggers in Python are hierarchical, with levels of the tree separated by ".". This has the nice advantage that you can configure logger properties for a package and all of its subpackages recursively - for example, you could set logging for all "airflow" sub-packages by configuring the "airflow" logger, as the sketch below shows.
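A minimal sketch of that mechanism in pure Python (the "app" / "app.db" names are made up for illustration):

import logging

# Configure the parent logger once...
logging.getLogger("app").setLevel(logging.WARNING)

# ...and every child logger under it inherits the effective level,
# because "app.db" sits below "app" in the logger tree.
child = logging.getLogger("app.db")
assert child.getEffectiveLevel() == logging.WARNING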
Again - this is nothing "special". All modern Python programs do that.
You can also, if you want, configure your own logger that does not follow the package naming. The package-based naming is a convention, not something that is enforced.
So in your program, DAG, or wherever, you could write:
my_logger = logging.getLogger("MY_CUSTOM_LOGGER")
and use all the goodies that come with loggers in Python: my_logger.info(...), my_logger.warning(...), logging exceptions with tracebacks, etc.
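For example:

import logging

my_logger = logging.getLogger("MY_CUSTOM_LOGGER")

my_logger.info("Processed %d rows", 42)      # lazy %-style formatting
my_logger.warning("Something looks off")

try:
    1 / 0
except ZeroDivisionError:
    my_logger.exception("Division failed")   # logs the message plus the full traceback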
Now, the key to this is "configuring" the output. In Python logging (and again, this is nothing "Airflow-special"), you can configure where loggers write their output, and how, completely independently from the place where the loggers are defined - by providing a logging configuration.
In Airflow (and this is the only "Airflow-specific" part) there is a predefined configuration that you get out of the box (and it should remain as it is), but you can easily extend it with your own custom logging configuration.
It is described here: https://airflow.apache.org/docs/apache-airflow/stable/logging-monitoring/logging-tasks.html#advanced-configuration
The way you'd do it is to create a file called ${AIRFLOW_HOME}/config/log_config.py with the following content:
import logging
from copy import deepcopy

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# Here your custom configuration follows (the format string and the
# handler class are just examples - use whatever suits you)
LOGGING_CONFIG['formatters']['MY_CUSTOM_FORMATTER'] = {
    "class": "logging.Formatter",
    "format": "%(asctime)s %(levelname)s - %(message)s",
}
LOGGING_CONFIG['handlers']['MY_CUSTOM_HANDLER'] = {
    "class": "logging.StreamHandler",
    "formatter": "MY_CUSTOM_FORMATTER",
}
LOGGING_CONFIG['loggers']['MY_CUSTOM_LOGGER'] = {
    "handlers": ["MY_CUSTOM_HANDLER"],
    "level": logging.INFO,
}
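You then need to point Airflow at this file, as described in the linked documentation - for example in the logging section of airflow.cfg (or via the equivalent AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS environment variable):

[logging]
logging_config_class = log_config.LOGGING_CONFIG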
Again - nothing "Airflow-special" here, except the need to deepcopy the original configuration, which makes sure that Airflow tasks still send logs in a way the webserver can see, that secrets are automatically masked, and so on.
This is the pure, standard Python approach - it all comes from the standard library.
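To tie it together, a hypothetical task using that logger could look like this (a sketch assuming the Airflow 2.x TaskFlow API; the DAG and task names are made up):

import logging

import pendulum
from airflow.decorators import dag, task

my_logger = logging.getLogger("MY_CUSTOM_LOGGER")

@dag(schedule=None, start_date=pendulum.datetime(2023, 1, 1), catchup=False)
def my_custom_logging_dag():
    @task
    def my_task():
        # Handled by MY_CUSTOM_HANDLER using MY_CUSTOM_FORMATTER's format
        my_logger.info("Only my own messages show up here")

    my_task()

my_custom_logging_dag()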
Upvotes: 5