Matt

Reputation: 1632

Google Cloud Composer (Apache Airflow) cannot access log files

I'm running a DAG in Google Cloud Composer (hosted Airflow) which runs fine in Airflow locally. All it does is print "Hello World". However, when I run it through Cloud Composer I receive the error:

*** Log file does not exist: /home/airflow/gcs/logs/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Fetching from: http://airflow-worker-d775d7cdd-tmzj9:8793/log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-d775d7cdd-tmzj9', port=8793): Max retries exceeded with url: /log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8825920160>: Failed to establish a new connection: [Errno -2] Name or service not known',))

I've also tried making the DAG add data into a database, and it actually succeeds about 50% of the time. However, it always returns this error message (and no other print statements or logs). Any help on why this might be happening would be much appreciated.

Upvotes: 5

Views: 4422

Answers (5)

rafalbiegacz

Reputation: 96

In general, the issue described here tends to be sporadic rather than persistent.

In certain situations, what can help is setting default_task_retries to a value that allows a task to be retried at least once.
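
For reference, a minimal sketch of the same idea scoped to a single DAG via retries in default_args; the dag_id and task_id are just placeholders taken from the log path in the question:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

# Retrying each task at least once can paper over a transient worker /
# log-sync hiccup; this mirrors the [core] default_task_retries option,
# but only for this DAG.
default_args = {
    "retries": 1,
    "retry_delay": timedelta(minutes=2),
}

with DAG(
    dag_id="matts_custom_dag",
    default_args=default_args,
    start_date=datetime(2020, 4, 20),
    schedule_interval=None,
) as dag:
    PythonOperator(
        task_id="main_test",
        python_callable=lambda: print("Hello World"),
    )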

Upvotes: 0

blackjid

Reputation: 1591

I have the same problem after upgrading Google Composer from 1.10.3 to 1.10.6. I can see in my logs that Airflow is trying to fetch the logs from a bucket whose name ends with -tenant, while the bucket in my account ends with -bucket.

In the configuration, I can see something weird too.

## airflow.cfg
[core]
remote_base_log_folder = gs://us-east1-dada-airflow-xxxxx-bucket/logs

## the running configuration also shows
core    remote_base_log_folder  gs://us-east1-dada-airflow-xxxxx-tenant/logs   env var

I wrote to Google support and they said the team is working on a fix.

EDIT: I've been accessing my logs with gsutil, replacing the bucket name suffix with -bucket:

gsutil cat gs://us-east1-dada-airflow-xxxxx-bucket/logs/...../5.logs
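
If you need to do this programmatically, here is a rough sketch with the google-cloud-storage client that swaps the suffix the same way; the bucket name and log path are just the examples from above:

from google.cloud import storage

# The environment config points at the "-tenant" bucket, but the logs
# actually live in the "-bucket" one, so swap the suffix before reading.
configured = "us-east1-dada-airflow-xxxxx-tenant"
log_path = "logs/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log"

client = storage.Client()
bucket = client.bucket(configured.replace("-tenant", "-bucket"))
print(bucket.blob(log_path).download_as_text())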

Upvotes: 4

benjo53

Reputation: 9

This issue has been resolved since at least Airflow version 1.10.10+composer.

Upvotes: -1

Nandakishore

Reputation: 1011

I faced the same situation on multiple occasions. As soon as a job finished, looking at the log in the Airflow web UI would give me the same error, but when I checked the same logs in the UI a minute or two later, I could see them properly. As per the answers above, it's a sync issue between the webserver and the worker node.

Upvotes: 1

SANN3

Reputation: 10069

We also faced the same issue, raised a support ticket with GCP, and got the following reply.

  1. The message is related to the latency of syncing logs from Airflow workers to the webserver; it takes at least a few minutes (depending on the number of objects and their size). The total log size does not seem large, but it is enough to noticeably slow down synchronization, hence we recommend cleaning up/archiving the logs.

  2. Basically, we recommend relying on Stackdriver logs instead, because of the latency inherent in the design of this sync.

I hope this will help you solve the problem.
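
Regarding point 2, a rough sketch of pulling the worker logs straight from Stackdriver with the google-cloud-logging client; the resource type and log name are the usual Composer ones, and the DAG id in the filter is just taken from the question:

from google.cloud import logging as cloud_logging

# Read the Airflow worker logs directly from Stackdriver instead of
# waiting for the worker -> webserver log sync.
client = cloud_logging.Client()
log_filter = (
    'resource.type="cloud_composer_environment" '
    'AND log_id("airflow-worker") '
    'AND textPayload:"matts_custom_dag"'
)
for entry in client.list_entries(filter_=log_filter, page_size=50):
    print(entry.timestamp, entry.payload)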

Upvotes: 4
