Luke_V

Reputation: 79

Airflow DockerOperator, how to add a volume to a container

I have a series of tasks that are supposed to spin up containers and run a Python script in them. However, I need to mount the volume which contains the Python code...

After running into many errors, I found out that passing volumes is not supported anymore, and a Mount object now has to be specified instead: https://github.com/apache/airflow/pull/15843

My code looks like this for the task:

from airflow.operators.docker_operator import DockerOperator
from docker.types import Mount

code_dir = Mount(target='/SRC',
                 source='/SRC/code',
                 type='bind')

task_name = DockerOperator(
            task_id=f"task_{task_name}",
            image='python-multi-purpose:latest',
            container_name=task_name,
            mount=[code_dir],
            api_version='auto',
            auto_remove=True,
            command=command,
            docker_url="unix://var/run/docker.sock",
            network_mode="bridge",
            dag=dag
        )

Unfortunately, I am running into the exact same error I was receiving when using volumes=[list]:

airflow.exceptions.AirflowException: Invalid arguments were passed to DockerOperator (task_id: task_yh_get_info.A). Invalid arguments were:
**kwargs: {'mount': [{'Target': '/SRC', 'Source': '/SRC/code', 'Type': 'bind', 'ReadOnly': False}]}

Would anyone be able to provide the syntax and a logical explanation of how to make this work?

Alternatively, any suggestion on how to approach this, keeping in mind that I have Airflow running inside a container?

Thanks!

Upvotes: 0

Views: 2505

Answers (2)

Jarek Potiuk

Reputation: 20077

OK. Another answer then (if you have the latest docker provider):

mount -> mounts.

You have a typo in the parameter name: https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/_api/airflow/providers/docker/operators/docker/index.html
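
For illustration, a minimal sketch of the corrected call using mounts= (plural), reusing the arguments from your question; the DAG object, task id and command are placeholders assumed to be defined in your DAG file:

from airflow.providers.docker.operators.docker import DockerOperator
from docker.types import Mount

# Same bind mount as in the question.
code_dir = Mount(target='/SRC', source='/SRC/code', type='bind')

task = DockerOperator(
    task_id="task_example",              # hypothetical task id
    image='python-multi-purpose:latest',
    mounts=[code_dir],                   # "mounts" (plural), not "mount"
    api_version='auto',
    auto_remove=True,
    command="python /SRC/script.py",     # hypothetical command
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    dag=dag,                             # DAG object from the surrounding DAG file
)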

Upvotes: 1

Jarek Potiuk

Reputation: 20077

I think the main problem you had is that there was a bug in Docker Provider 2.0.0 which prevented the DockerOperator from running with a Docker-in-Docker setup. It's been solved in 2.1.0.

You need to upgrade to the latest Docker Provider 2.1.0 https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/index.html#id1

You can do it by extending the image as described in https://airflow.apache.org/docs/docker-stack/build.html#extending-the-image with, for example, this Dockerfile:

FROM apache/airflow
RUN pip install --no-cache-dir apache-airflow-providers-docker==2.1.0

In this case the operator will work out-of-the-box in a "fallback" mode (with a warning message), but you can also disable the mount that causes the problem. More explanation can be found in the docs: https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/_api/airflow/providers/docker/operators/docker/index.html
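
A minimal sketch of disabling that mount, assuming the mount_tmp_dir parameter is available in your provider version (2.1.0 or later); task id and command are placeholders:

from airflow.providers.docker.operators.docker import DockerOperator
from docker.types import Mount

task = DockerOperator(
    task_id="task_example",                  # hypothetical task id
    image='python-multi-purpose:latest',
    mount_tmp_dir=False,                     # do not bind-mount the temporary host directory
    mounts=[Mount(target='/SRC', source='/SRC/code', type='bind')],
    api_version='auto',
    auto_remove=True,
    command="python /SRC/script.py",         # hypothetical command
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    dag=dag,                                 # DAG object defined elsewhere
)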

However, you have to remember that when you use docker-in-docker (and mount the docker socket), the behaviour of a "bind" mount might not be what you are expecting. It will actually mount the "host" folders into the new container run by DockerOperator; it will not mount folders from one container into the other. But I believe this is what you wanted to do anyway - if that's the case, then the new version of the Docker Provider should fix your problems.
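
For example (the path here is hypothetical), the source of a bind Mount has to be a path that exists on the Docker host, not inside the Airflow container:

from docker.types import Mount

# With docker-in-docker, "source" is resolved on the Docker host,
# not inside the Airflow container that schedules the task.
code_dir = Mount(
    target='/SRC',                 # path inside the task container
    source='/opt/projects/code',   # hypothetical path on the Docker host
    type='bind',
)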

Also - if that still does not solve your problem, you can always downgrade the Docker Provider to the previous version, from before the "volumes"/"mounts" change.

Upvotes: 1
