Reputation: 47
I'm trying to create a simple DAG workflow in Apache Airflow that updates data in my localhost PostgreSQL database.
Can someone please tell me why my DAG fails with the errors below ONLY when I use CeleryExecutor? I tried running the same DAG with LocalExecutor and it ran smoothly with no errors.
These are the errors from the logs:
*** Log file isn't local.
*** Fetching here: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
*** Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
Thank you for your help!
Upvotes: 2
Views: 6036
Reputation: 3313
This is caused by missing hostname resolution: the webserver cannot resolve the worker's hostname when it tries to fetch the log over HTTP. The fix depends on whether the webserver is running in a Docker container.
If the webserver is not containerized, modify /etc/hosts on the webserver host (assuming Linux) by adding one entry per worker:
# /etc/hosts
...
192.168.xxx.yyy airflow-worker0
192.168.xxx.zzz airflow-worker1
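To verify the fix from the webserver host, here is a minimal check (a sketch using the requests library; airflow-worker0 is the placeholder hostname from the entries above, and the log path is the one from the question's error log):
# check_worker_log.py
import requests

# Placeholder worker hostname and log path taken from the error log above
url = ("http://airflow-worker0:8793/log/"
       "PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log")

resp = requests.get(url, timeout=10)
print(resp.status_code)  # expect 200 once the hostname resolves and the log exists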
If the webserver runs in a container (assuming a Docker Compose setup), pass the hostname resolutions as extra_hosts for the webserver service:
# docker-compose.yml
version: "3.9"
services:
  webserver:
    extra_hosts:
      - "airflow-worker0:192.168.xxx.yyy"
      - "airflow-worker1:192.168.xxx.zzz"
    ...
...
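Note that extra_hosts entries are written into the container's /etc/hosts when the container is created, so recreate the webserver container (e.g. docker-compose up -d webserver) for the change to take effect.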
Upvotes: 0
Reputation: 26
Best solution
If you want to see the logs in the web UI, you need to configure the hostname mapping in /etc/hosts on the machine running the webserver (the node making the request), mapping the worker node's IP to its hostname:
10.xxx.xxx.xxx hostname
After that, the request will go to:
http://hostname.pl:8793/log/..
Fast solution
If you don't want to do that, you can read the logs directly on the worker node under airflow/logs/{dagName}/{taskName}/{executionTimestamp}/{tryNumber}.log
In your case that would be airflow/logs/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
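If you want to script this, here is a minimal sketch that builds the same path (assuming the default base_log_folder of ~/airflow/logs and the pre-2.0 layout shown above; the helper name is hypothetical):
# read_task_log.py
import os

BASE_LOG_FOLDER = os.path.expanduser("~/airflow/logs")  # default; check airflow.cfg

def task_log_path(dag_id, task_id, execution_date, try_number=1):
    # Layout: {dag_id}/{task_id}/{execution_date}/{try_number}.log
    return os.path.join(BASE_LOG_FOLDER, dag_id, task_id,
                        execution_date, f"{try_number}.log")

path = task_log_path("PDI_Incr_20190407_v2", "checkBCWatermarkDt",
                     "2019-04-07T17:00:00")
with open(path) as f:
    print(f.read())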
Upvotes: 1