mad_

Reputation: 8273

How to get logs from inside a container executed using DockerOperator? (Airflow)

I'm facing logging issues with DockerOperator.

I'm running a Python script inside a Docker container using DockerOperator, and I need Airflow to surface the logs from the script running inside the container. Airflow is marking the job as a success, but the script inside the container is failing, and I have no clue what is going on because I cannot see the logs properly. Is there a way to set up logging for DockerOperator, apart from setting the tty option to True as suggested in the docs?
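For reference, a minimal sketch of the kind of task in question (the image, command, and DAG wiring are placeholders, not the actual setup):

from airflow.operators.docker_operator import DockerOperator

run_script = DockerOperator(
    task_id='run_script',
    image='my_image:latest',        # placeholder image
    command='python my_script.py',  # placeholder command
    tty=True,  # the option suggested in the docs
    dag=dag,
)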

Upvotes: 4

Views: 3351

Answers (2)

tsveti_iko

Reputation: 7982

Instead of DockerOperator, you can use client.containers.run from the Docker Python SDK (the docker package) and do the following:

import docker

from airflow import DAG
from airflow.decorators import task

default_args = {'owner': 'airflow'}  # placeholder; use your own defaults

with DAG(dag_id='dag_1',
         default_args=default_args,
         schedule_interval=None,
         tags=['my_dags']) as dag:

    @task(task_id='task_1')
    def start_task(**kwargs):

        # get the docker params from the environment
        client = docker.from_env(version='auto')

        # run the container (detached, so run() returns a Container object)
        container = client.containers.run(

            # The container you wish to call
            image='__container__:latest',

            # The command to run inside the container
            command="python test.py",

            auto_remove=True,
            tty=True,
            detach=True,

            ipc_mode='host',

            network_mode='bridge',

            # Passing the GPU access
            device_requests=[
                docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
            ],

            # Give the proper system volume mount point
            volumes=[
                'src:/src',
            ],

            working_dir='/src'
        )

        # stream the container's stdout/stderr into the Airflow task log
        output = container.attach(stdout=True, stream=True, logs=True)
        for line in output:
            print(line.decode())

        return str(container)

    test = start_task()

Then, in your test.py script (inside the Docker container), do the logging with the standard Python logging module, making sure the output goes to stdout so the attach() call above can capture it:

import logging

# send log records to stdout so the attached stream picks them up
logging.basicConfig(level=logging.INFO)

logger = logging.getLogger("airflow.task")
logger.info("Log something.")


Upvotes: 0

Daniel Huang

Reputation: 6548

It looks like you can have logs pushed to XComs, but it's off by default. First, you need to pass xcom_push=True for the operator to start sending the last line of output to XCom. Then, additionally, you can pass xcom_all=True to send all output to XCom, not just the last line.
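As a minimal sketch (Airflow 1.10.x import path, matching the source linked below; the image and command are placeholders):

from airflow.operators.docker_operator import DockerOperator

run_script = DockerOperator(
    task_id='run_script',
    image='__container__:latest',  # placeholder image
    command='python test.py',      # placeholder command
    xcom_push=True,  # push the last line of stdout to XCom
    xcom_all=True,   # push all lines of stdout, not just the last
    dag=dag,
)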

Perhaps not the most convenient place to put debug information, but it's at least fairly accessible in the UI: either in the XCom tab when you click into a task, or on the page that lists and filters XComs (under Browse).

Source: https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L112-L117 and https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L248-L250

Upvotes: 1
