Reputation: 343
The scheduler and the webserver are running in separate containers, and when I run a DAG and check its logs in the webserver, I get this error:
*** Log file does not exist: /usr/local/airflow/logs/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log
*** Fetching from: http://465e0f4a4332:8793/log/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='465e0f4a4332', port=8793): Max retries exceeded with url: /log/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0a143700d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
I set the Airflow variables as mentioned in this other similar question, and the only settings I'm changing in the cfg files are these:
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres:5432/airflow
AIRFLOW__CORE__LOAD_EXAMPLES=False
AIRFLOW__CORE__BASE_URL = http://{hostname}:8080
I manually checked, and the log files are being generated properly. I'm assuming the only problem is that the URL is not publicly accessible from the webserver container. I'm not sure where I'm messing this up; I'm running and testing all of this locally.
Upvotes: 20
Views: 23843
Reputation: 1
Solution: enable remote logging, either by setting remote_logging = True in the [core] section of airflow.cfg or by exporting AIRFLOW__CORE__REMOTE_LOGGING=True.
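Note that turning on remote logging by itself is usually not enough: Airflow also needs a remote location to write to and a connection to write with. A minimal sketch, assuming S3 as the remote store; the bucket path and connection id below are placeholders, and on Airflow 2.x these settings live in the [logging] section (AIRFLOW__LOGGING__...) instead of [core]:
AIRFLOW__CORE__REMOTE_LOGGING=True
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER=s3://my-log-bucket/airflow-logs
AIRFLOW__CORE__REMOTE_LOG_CONN_ID=my_s3_conn
With remote logging enabled, the webserver reads task logs from the remote store instead of asking the worker's log server over HTTP.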
Upvotes: 0
Reputation: 367
The problem is that the Docker containers do not share a filesystem. This is indicated by the first line of the error output (Log file does not exist).
Airflow then falls back to fetching the log file over HTTP, as indicated by the second line (Fetching from: ...). Other answers try to fix this by overriding HOSTNAME_CALLABLE; however, that will not work unless the host is actually exposing the log files over HTTP.
The solution is to fix the first problem by mounting a shared volume.
In your docker-compose.yml file, add a new volume called logs-volume:
volumes:
  logs-volume:
Then, also in docker-compose.yml, mount this volume at the required log directory, in your case /usr/local/airflow/logs, for each service:
services:
  worker:
    volumes:
      - logs-volume:/usr/local/airflow/logs
  webserver:
    volumes:
      - logs-volume:/usr/local/airflow/logs
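Putting both pieces together, the relevant parts of docker-compose.yml end up looking roughly like this. This is only a sketch: the scheduler service is an assumption based on the question's setup, where the scheduler, not a Celery worker, writes the task logs; use whatever service names your file already has.
services:
  webserver:
    volumes:
      - logs-volume:/usr/local/airflow/logs
  scheduler:
    volumes:
      - logs-volume:/usr/local/airflow/logs
  worker:
    volumes:
      - logs-volume:/usr/local/airflow/logs

volumes:
  logs-volume:
Once every container that writes or reads task logs mounts the same volume, the webserver finds the file on its own filesystem and never falls back to the HTTP fetch that produces the "Failed to fetch log file from worker" error.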
Upvotes: 9
Reputation: 4873
I ran into the same issue while using docker-compose from Airflow with CeleryExecutor. My problem was that the container running the airflow webserver command was unable to reach the celery worker node running on a different machine.
I solved it by exposing the expected port on the worker node and adding a DNS entry on the main node where the webserver runs.
Celery Worker docker-compose file:
...
services:
  airflow-worker:
    <<: *airflow-common
    hostname: worker_my_hostname
    ports:
      - 8793:8793
    command: celery worker
    restart: always
Main node docker-compose file section:
---
version: "3"
x-airflow-common: &airflow-common
  extra_hosts:
    - "worker_my_hostname:10.10.59.200"
...
Logs with original failure message:
Failed to fetch log file from worker. HTTPConnectionPool(host='worker_my_hostname', port=8793): Max retries exceeded with url: /log/dag_id/task_id/2021-05-14T20:24:49.433789+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f91cb1b7ac8>: Failed to establish a new connection: [Errno 111] Connection refused',))
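To confirm that the main node can now reach the worker's log server, a quick check like the one below can be run from the webserver host. This is only a sketch: the URL reuses the path from the failure message above, and the timeout value is arbitrary.
import requests

# Airflow workers serve task logs on port 8793.
url = ("http://worker_my_hostname:8793/log/"
       "dag_id/task_id/2021-05-14T20:24:49.433789+00:00/1.log")

try:
    response = requests.get(url, timeout=5)
    # Any HTTP status (even 401/403 on newer Airflow versions, which sign log
    # requests) means the connection itself works; the original error was a
    # refused connection.
    print(response.status_code)
except requests.ConnectionError as exc:
    print(f"Still unreachable: {exc}")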
Upvotes: 0
Reputation: 131
The worker's hostname is not being correctly resolved.
Add a file hostname_resolver.py:
import os
import socket

import requests


def resolve():
    """
    Resolves Airflow external hostname for accessing logs on a worker
    """
    if 'AWS_REGION' in os.environ:
        # Return EC2 instance hostname:
        return requests.get(
            'http://169.254.169.254/latest/meta-data/local-ipv4').text

    # Use DNS request for finding out what's our external IP:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(('1.1.1.1', 53))
    external_ip = s.getsockname()[0]
    s.close()
    return external_ip
And export: AIRFLOW__CORE__HOSTNAME_CALLABLE=airflow.hostname_resolver:resolve
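To sanity-check the resolver before pointing Airflow at it, the function can be imported and called directly inside the worker container. A minimal sketch, assuming the module is importable under whatever path you used in the variable above (the bare file name in the import is just an example):
# Run inside the worker container; the printed address is what the webserver
# will use to fetch logs on port 8793, so it must be reachable from there.
from hostname_resolver import resolve

print(resolve())
Depending on the Airflow version, the callable may also need to be given in dotted form (e.g. airflow.hostname_resolver.resolve) rather than the module:function form shown above.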
Upvotes: 1