jhilliar

Airflow Docker Deployment: Task Not Getting Run After start_date + schedule_interval

I am new to Airflow. I think I have read all of the articles in the Airflow documentation about scheduling, but I still cannot get my DAGs to run after start_date + schedule_interval (i.e. no task instances are created). I am using Docker. I am wondering if I am missing a command that schedules the DAGs, even though this wasn't the case when I was using the tutorial code.

Here is my Dockerfile.

FROM ubuntu:latest
FROM python:3

RUN apt-get update -y
RUN apt-get install -y python-pip python-dev apt-utils build-essential
RUN pip install --upgrade pip

# Installs these and a few others
# mysqlclient==1.3.10
# airflow==1.7.1.3
COPY dependencies /dependencies
RUN pip install -r dependencies/requirements_loader.txt

COPY airflow /root/airflow

# Load other dependencies 


# I have tried many different variations of these commands with no luck
CMD airflow webserver -p 8080
CMD airflow scheduler -d DAG_id

I am using the PythonOperator with a module that loads a library that I wrote. I am not sure if this is the right way to do this, but it works with airflow test dag_id execution_date. This is what is most peculiar to me: the test works, but the task never actually runs when I start Airflow. I am using the LocalExecutor. Here is my DAG.

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from my_lib import my_func

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 3, 6),
    'email_on_failure': False,
    'email_on_retry': False,
    'email': ['[email protected]'],
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}


dag = DAG('dag_id', default_args=default_args, schedule_interval="31 2 * * *")

t1 = PythonOperator(
    dag=dag,
    task_id='run_my_func',
    provide_context=False,
    python_callable=my_func)

I have also experimented with the schedule interval and start date, including start dates a month in the past with an @daily interval. None of these has worked.

What is really perplexing is that when I test a DAG it works, but after deployment it is never scheduled and no task_instances are created.

Can anyone point me in the right direction for a deployment that properly schedules the DAGs? What am I doing wrong?

Upvotes: 0

Answers (1)

jhilliar

The problem is that a Dockerfile cannot use two CMD instructions to start two processes. Once I created two Dockerfiles and ran them with docker-compose, it worked fine. The Docker documentation says:

There can only be one CMD instruction in a Dockerfile. If you list more than one CMD then only the last CMD will take effect.

https://docs.docker.com/engine/reference/builder/#cmd
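
For anyone looking for a starting point, here is a rough sketch of what such a docker-compose setup might look like. It is not the exact file I used: the service names and the Dockerfile.webserver / Dockerfile.scheduler file names are hypothetical, and it assumes each Dockerfile is a copy of the one in the question that ends with a single CMD.

```yaml
# docker-compose.yml (illustrative sketch; file and service names are assumptions)
version: "3"
services:
  webserver:
    build:
      context: .
      # Hypothetical Dockerfile whose only CMD is: airflow webserver -p 8080
      dockerfile: Dockerfile.webserver
    ports:
      - "8080:8080"
  scheduler:
    build:
      context: .
      # Hypothetical Dockerfile whose only CMD is: airflow scheduler
      dockerfile: Dockerfile.scheduler
```

Each container then runs exactly one process, so neither CMD is discarded. Both containers need to be configured to use the same Airflow metadata database, otherwise the webserver will not see what the scheduler has scheduled.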

Upvotes: 1
