jason m

Reputation: 6835

airflow.exceptions.AirflowException: dag_id could not be found: sample_dag. Either the dag did not exist or it failed to parse

I have airflow installed.

airflow info yields:

Apache Airflow
version                | 2.2.0                                                
executor               | SequentialExecutor                                   
task_logging_handler   | airflow.utils.log.file_task_handler.FileTaskHandler  
sql_alchemy_conn       | sqlite:////home/user@local/airflow/airflow.db
dags_folder            | /home/user@local/airflow/dags                
plugins_folder         | /home/user@local/airflow/plugins             
base_log_folder        | /home/user@local/airflow/logs                
remote_base_log_folder |                                                      
                                                                              

System info
OS              | Linux                                                                                  
architecture    | x86_64                                                                                 
uname           | uname_result(system='Linux', node='ubuntuVM.local', release='5.11.0-37-generic',   
                | version='#41~20.04.2-Ubuntu SMP Fri Sep 24 09:06:38 UTC 2021', machine='x86_64')       
locale          | ('en_US', 'UTF-8')                                                                     
python_version  | 3.9.7 (default, Sep 16 2021, 13:09:58)  [GCC 7.5.0]                                    
python_location | /home/user@local/anaconda3/envs/airflow/bin/python                             
                                                                                                         

Tools info
git             | git version 2.25.1                                                                     
ssh             | OpenSSH_8.2p1 Ubuntu-4ubuntu0.3, OpenSSL 1.1.1f  31 Mar 2020                           
kubectl         | NOT AVAILABLE                                                                          
gcloud          | NOT AVAILABLE                                                                          
cloud_sql_proxy | NOT AVAILABLE                                                                          
mysql           | NOT AVAILABLE                                                                          
sqlite3         | 3.36.0 2021-06-18 18:36:39                                                             
                | 5c9a6c06871cb9fe42814af9c039eb6da5427a6ec28f187af7ebfb62eafa66e5                       
psql            | NOT AVAILABLE                                                                          
                                                                                                         

Paths info
airflow_home    | /home/user@local/airflow                                                       
system_path     | /home/user@local/anaconda3/envs/airflow/bin:/home/user@local/anaconda3/
                | condabin:/sbin:/bin:/usr/bin:/usr/local/bin:/snap/bin                                  
python_path     | /home/user@local/anaconda3/envs/airflow/bin:/home/user@local/anaconda3/
                | envs/airflow/lib/python39.zip:/home/user@local/anaconda3/envs/airflow/lib/pytho
                | n3.9:/home/user@local/anaconda3/envs/airflow/lib/python3.9/lib-dynload:/home/jm
                | ellone@local/anaconda3/envs/airflow/lib/python3.9/site-packages:/home/user@ocp.
                | local/airflow/dags:/home/user@local/airflow/config:/home/user@local/air
                | flow/plugins                                                                           
airflow_on_path | True                                                                                   
                                                                                                         

Providers info
apache-airflow-providers-celery | 2.1.0
apache-airflow-providers-ftp    | 2.0.1
apache-airflow-providers-http   | 2.0.1
apache-airflow-providers-imap   | 2.0.1
apache-airflow-providers-sqlite | 2.0.1

I then changed into /home/user@local/airflow/dags and used touch to create an empty file named sample_dag.py.

Next, I ran:

airflow dags backfill sample_dag

But airflow throws:

airflow.exceptions.AirflowException: dag_id could not be found: sample_dag.py. Either the dag did not exist or it failed to parse.

Further, I do not see my dag in my localhost:8080 page but I do see the sample DAGs.

I don't expect a blank .py file to run, but I would expect Airflow to at least list it.

How do I go about creating my first DAG? Judging by the docs I've read, this should be correct.

Upvotes: 0

Views: 498

Answers (1)

Josh Fell

Reputation: 3589

The exception is thrown because no DAG with a dag_id of "sample_dag" exists in the dags_folder location. The dag_id comes from the argument passed to the DAG constructor, not from the name of the DAG file.

For example:

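from datetime import datetime

from airflow import DAG

with DAG(
    dag_id='hello_world',  # this value, not the file name, is what the CLI and UI use
    schedule_interval="@daily",
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['example'],
) as dag:
    ...  # define tasks here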

Airflow will not recognize an empty DAG file, nor will it create an empty DAG in the UI.
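
If it helps to see the file-name vs. dag_id distinction without involving Airflow at all, here is a rough sketch using only the standard library. The helper name find_literal_dag_ids is made up for illustration, and this is not how Airflow discovers DAGs (as far as I know it imports the file and collects the DAG objects it finds). It only looks for literal dag_id strings passed to a call named DAG, which is enough to show that an empty file has nothing to register.

```python
import ast
from typing import List


def find_literal_dag_ids(source: str) -> List[str]:
    """Return dag_id string literals passed to calls named DAG(...) in source.

    Hypothetical helper for illustration only; Airflow does not scan
    source text like this.
    """
    dag_ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `DAG(...)` and `something.DAG(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DAG":
            continue
        # dag_id may be a keyword argument or the first positional argument.
        candidates = [kw.value for kw in node.keywords if kw.arg == "dag_id"]
        if not candidates and node.args:
            candidates = [node.args[0]]
        for value in candidates:
            if isinstance(value, ast.Constant) and isinstance(value.value, str):
                dag_ids.append(value.value)
    return dag_ids


# What `touch sample_dag.py` produces: a file with no content.
EMPTY_FILE = ""

# A file that could equally be saved as sample_dag.py; the name plays no role.
DAG_FILE = '''
from datetime import datetime
from airflow import DAG

with DAG(dag_id="hello_world", start_date=datetime(2021, 1, 1)) as dag:
    pass
'''

print(find_literal_dag_ids(EMPTY_FILE))  # → []
print(find_literal_dag_ids(DAG_FILE))    # → ['hello_world']
```

So even if the second file were saved as sample_dag.py, the id to pass to the CLI would be hello_world, not sample_dag.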

To get going with your first DAG, you can check out the classic tutorial, the TaskFlow API tutorial, or dive into one of the sample DAGs that were loaded initially via the Code View in the Airflow UI.

Upvotes: 1
