Reputation: 179
I am new to Airflow. I think I have read all of the articles in the Airflow documentation about scheduling, but I still cannot get my DAGs to run after start_date + schedule_interval (i.e. no task instances are created). I am using Docker. I am wondering if I am missing a command that schedules DAGs, even though this wasn't an issue when I was using the tutorial code.
Here is my Dockerfile.
FROM ubuntu:latest
FROM python:3
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev apt-utils build-essential
RUN pip install --upgrade pip
# Installs these and a few others
# mysqlclient==1.3.10
# airflow==1.7.1.3
COPY dependencies /dependencies
RUN pip install -r dependencies/requirements_loader.txt
COPY airflow /root/airflow
# Load other dependencies
# I have tried many different variations of these commands with no luck
CMD airflow webserver -p 8080
CMD airflow scheduler -d DAG_id
I am using the PythonOperator with a module that loads a library that I wrote. I am not sure if this is the right way to do this, but it works with airflow test dag_id execution_date. This is what is most peculiar to me: the test works, but the DAG never actually gets run when I start Airflow. I am using the LocalExecutor. Here is my DAG.
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from my_lib import my_func
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 3, 6),
    'email_on_failure': False,
    'email_on_retry': False,
    'email': ['[email protected]'],
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}
dag = DAG('dag_id', default_args=default_args, schedule_interval="31 2 * * *")
t1 = PythonOperator(
    dag=dag,
    task_id='run_my_func',
    provide_context=False,
    python_callable=my_func)
I have also experimented with the schedule interval and start date, including start dates a month in the past with a @daily interval. None of these have given me any luck.
What is really perplexing is that when I test a DAG it works, but after deployment it is not scheduled and no task_instances are created.
Can anyone point me in the right direction for a deployment that properly schedules the DAGs? What am I doing wrong?
Upvotes: 0
Views: 1135
Reputation: 179
The problem is that you cannot use two CMD instructions in one Dockerfile. Once I created two Dockerfiles, one per command, and ran them with docker-compose, it worked fine.
There can only be one CMD instruction in a Dockerfile. If you list more than one CMD then only the last CMD will take effect.
https://docs.docker.com/engine/reference/builder/#cmd
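To make the rule concrete, here is a minimal Python sketch that scans Dockerfile text and reports which CMD line would take effect. The helper name effective_cmd is hypothetical and not part of Docker or Airflow, and the sketch assumes a single-stage Dockerfile with no backslash line continuations, so treat it as a quick sanity check rather than a real Dockerfile parser.

```python
def effective_cmd(dockerfile_text):
    """Return (all_cmds, effective_cmd) for single-stage Dockerfile text.

    Hypothetical helper for illustration only. It assumes one build stage
    and does not handle backslash line continuations.
    """
    cmds = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split(None, 1)
        # Instruction names are case-insensitive in a Dockerfile.
        if parts[0].upper() == "CMD":
            cmds.append(parts[1] if len(parts) > 1 else "")
    # Only the last CMD takes effect; earlier ones are ignored.
    return cmds, (cmds[-1] if cmds else None)


# The two CMD lines from the Dockerfile in the question.
sample = """\
FROM python:3
CMD airflow webserver -p 8080
CMD airflow scheduler -d DAG_id
"""

all_cmds, effective = effective_cmd(sample)
print(len(all_cmds), "CMD lines found; Docker uses:", effective)
```

With the Dockerfile from the question, only the scheduler command is reported as effective, which is why the webserver and the scheduler each need their own container (or a single entrypoint that starts both).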
Upvotes: 1