Reputation: 2741
I am new to using Apache Airflow. Some operators of my DAG have a failed status, and I am trying to understand the origin of the error.
Here are the details of the problem:
My DAG is pretty big, and certain parts of it are composed of SubDAGs.
What I notice in the Composer UI is that the SubDAGs that failed all did so in a task with task_id download_file, which uses XCom with a GoogleCloudStorageDownloadOperator:
>> GoogleCloudStorageDownloadOperator(
       task_id='download_file',
       bucket="sftp_sef",
       object="{{task_instance.xcom_pull(task_ids='find_file') | first }}",
       filename="/home/airflow/gcs/data/zips/{{{{ds_nodash}}}}_{0}.zip".format(table)
   )
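For context, here is a simplified sketch of how the two tasks are wired together. The GoogleCloudStorageListOperator, the DAG object and the prefix below are placeholders standing in for my real code, but the XCom flow is the same:

from datetime import datetime
from airflow import DAG
from airflow.contrib.operators.gcs_list_operator import GoogleCloudStorageListOperator
from airflow.contrib.operators.gcs_download_operator import GoogleCloudStorageDownloadOperator

table = 'consentement_email'  # placeholder table name

dag = DAG(
    'datamart_integration_sketch',   # placeholder DAG id
    start_date=datetime(2020, 4, 1),
    schedule_interval=None,
)

# find_file lists objects in the bucket; its return value (a list of object
# names) is automatically pushed to XCom.
find_file = GoogleCloudStorageListOperator(
    task_id='find_file',
    bucket='sftp_sef',
    prefix='{0}/'.format(table),     # placeholder prefix
    dag=dag,
)

# download_file pulls that list from XCom and keeps the first element via the
# Jinja 'first' filter. The quadruple braces are needed because .format() eats
# one level of braces: after .format() the string still contains the Jinja
# template {{ds_nodash}}.
download_file = GoogleCloudStorageDownloadOperator(
    task_id='download_file',
    bucket='sftp_sef',
    object="{{task_instance.xcom_pull(task_ids='find_file') | first }}",
    filename="/home/airflow/gcs/data/zips/{{{{ds_nodash}}}}_{0}.zip".format(table),
    dag=dag,
)

find_file >> download_file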
The logs of said SubDAG do not show anything useful.
LOG:
[2020-04-07 15:19:25,618] {models.py:1359} INFO - Dependencies all met for
[2020-04-07 15:19:25,660] {models.py:1359} INFO - Dependencies all met for
[2020-04-07 15:19:25,660] {models.py:1577} INFO -
-------------------------------------------------------------------------------
Starting attempt 10 of 1
[2020-04-07 15:19:25,685] {models.py:1599} INFO - Executing on 2020-04-06T11:44:31+00:00
[2020-04-07 15:19:25,685] {base_task_runner.py:118} INFO - Running: ['bash', '-c', 'airflow run datamart_integration.consentement_email download_file 2020-04-06T11:44:31+00:00 --job_id 156313 --pool integration --raw -sd DAGS_FOLDER/datamart/datamart_integration.py --cfg_path /tmp/tmpacazgnve']
I am not sure if there is somewhere else I should be checking... Here are my questions:
- How can I debug errors like this one in Cloud Composer?
- How can I test and debug my DAGs?
- How can I verify errors related to XCom?
Upvotes: 1
Views: 1596
Reputation: 4032
Regarding your three questions:
First, when using Cloud Composer you have several ways of debugging errors in your code. According to the documentation, you should:
- Check the Airflow logs.
These logs are related to single DAG tasks. It is possible to view them in the Cloud Storage logs folder and in the Airflow web interface.
When you create a Cloud Composer environment, a Cloud Storage bucket is also created and associated with it. Cloud Composer stores the logs for single DAG tasks in the logs folder inside this bucket, and each workflow folder has a folder for its DAGs and sub-DAGs. You can check its structure here, and see the sketch after these steps for one way to list those log files programmatically.
Regarding the Airflow web interface, it is refreshed every 60 seconds. Also, you can find more about it here.
- Review Google Cloud's operations suite.
You can use Cloud Monitoring and Cloud Logging with Cloud Composer. Whereas Cloud Monitoring provides visibility into the performance and overall health of cloud-powered applications, Cloud Logging shows the logs that the scheduler and worker containers produce. Therefore, you can use both or just the one you find more useful based on your need.
- In the Cloud Console, check for errors on the pages for the Google Cloud components running your environment.
- In the Airflow web interface, check the DAG's Graph View for failed task instances.
Thus, these are the steps recommended when troubleshooting your DAG.
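To make the first point concrete, here is a minimal sketch of how you could list the log files of the failing task directly from the environment's bucket with the google-cloud-storage client. The bucket name is an assumption that you would replace with the bucket Cloud Composer created for your environment, and the prefix follows the logs/<dag_id>/<task_id>/<execution_date>/ layout used for task logs:

from google.cloud import storage

# Assumption: replace with your own Composer environment's bucket
COMPOSER_BUCKET = 'us-central1-my-environment-1234abcd-bucket'

# Task logs are stored under logs/<dag_id>/<task_id>/<execution_date>/<try_number>.log;
# the dag_id of the failing SubDAG task comes from the 'airflow run' line in your log.
prefix = 'logs/datamart_integration.consentement_email/download_file/'

client = storage.Client()
for blob in client.list_blobs(COMPOSER_BUCKET, prefix=prefix):
    print(blob.name)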
Second, regarding testing and debugging, it is recommended that you separate your production and test environments to avoid DAG interference.
Furthermore, it is possible to test your DAG locally; there is a tutorial about this topic in the documentation, here. Testing locally allows you to identify syntax and task errors, as in the example below. However, I must point out that it won't be possible to check/evaluate dependencies and communication to the database.
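For example, a common way to catch syntax and import errors locally is to load the DAGs folder into a DagBag and assert that nothing failed to import (the dags/ path is an assumption, adjust it to your project layout):

from airflow.models import DagBag

def test_no_import_errors():
    # Parses every file in the folder; syntax errors and failed imports
    # are collected in dag_bag.import_errors instead of raising.
    dag_bag = DagBag(dag_folder='dags/', include_examples=False)
    assert len(dag_bag.import_errors) == 0, dag_bag.import_errors

def test_parent_dag_is_loaded():
    dag_bag = DagBag(dag_folder='dags/', include_examples=False)
    # The parent dag_id is taken from the log in your question.
    assert 'datamart_integration' in dag_bag.dags

You can run these with pytest before pushing the DAG to the Composer bucket.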
Third, in general, in order to verify errors in XCom:
I would like to point out that, according to this documentation, the path to GoogleCloudStorageDownloadOperator was updated to GCSToLocalOperator.
In addition, I also encourage you to have a look at the operator's code and documentation to check XCom syntax and errors; one way to inspect the value that find_file pushes at runtime is sketched below.
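As a sketch of how to inspect that value at runtime (the import path assumes Airflow 1.10, which matches the log format in your question, and find_file / download_file refer to your existing tasks), you could temporarily add a task that pulls the same XCom value and fails loudly if it is empty:

from airflow.operators.python_operator import PythonOperator

def _print_find_file_xcom(**context):
    # Pull exactly what the download_file template pulls
    value = context['ti'].xcom_pull(task_ids='find_file')
    print('find_file pushed: %r' % (value,))
    if not value:
        raise ValueError("find_file pushed an empty result, so the 'first' filter has nothing to return")

debug_xcom = PythonOperator(
    task_id='debug_xcom',
    python_callable=_print_find_file_xcom,
    provide_context=True,  # needed on Airflow 1.10 to receive the context kwargs
    dag=dag,               # placeholder: the (Sub)DAG that contains find_file
)

find_file >> debug_xcom >> download_file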
Feel free to share the error code with me if you need further help.
Upvotes: 2