raaj

Reputation: 483

How to fix "Failed to fetch log file from worker. Unsupported URL protocol" error in Airflow DAG logs?

I am running Airflow via Docker using the image apache/airflow:2.1.0. Please refer to this thread for the initial error I faced.

Currently I am able to run my previously existing DAGs. However, when I add newer DAGs I get the following error in the log file. I am pretty sure it is not an issue with memory or compute.

*** Log file does not exist: /opt/airflow/logs/my-task/my-task/2021-06-15T14:11:33.254428+00:00/1.log
*** Fetching from: http://:8793/log/my-task/my-task/2021-06-15T14:11:33.254428+00:00/1.log
*** Failed to fetch log file from worker. Unsupported URL protocol ''

Things I have tried already:

Upvotes: 10

Views: 26270

Answers (7)

Turbo

Reputation: 1

It worked for me when I gave write permission to the logs directory.
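For example, on a docker-compose setup this usually means making the bind-mounted logs folder writable by the container user before starting Airflow (a minimal sketch, assuming the ./logs bind mount and the AIRFLOW_UID convention from the official quick start; adjust paths to your own setup):

# create the bind-mounted folders and record the host user id for the container
mkdir -p ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)" > .env
# make sure the logs folder itself is writable
chmod -R u+rwX,g+rwX ./logs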

Upvotes: 0

JDune

Reputation: 577

For me this error was caused by a syntax error in one of my custom operators that I was actively working on. I didn't see the DAG parse error, so I was stuck for a while. Once the syntax error was fixed, the error went away. Silly mistake, but hopefully that helps someone.
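One quick way to catch this kind of parse error is to import the DAG file with plain Python inside the container; any traceback it prints is the same error that keeps the task from ever running (a sketch, assuming a typical docker-compose layout — the airflow-scheduler service name and my_dag.py file name are placeholders):

docker-compose exec airflow-scheduler python /opt/airflow/dags/my_dag.py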

Upvotes: 0

user3029620

Reputation: 1537

My Airflow was installed in the /var/airflow folder. I gave it write permission, stopped the containers, and restarted the Docker service:

sudo chmod -R 777 /var/airflow/
docker-compose down
sudo systemctl restart docker

Upvotes: 0

zakaev

Reputation: 11

If you are using kind, here is one more way to fix it:

First of all, get the configuration file by typing:

helm show values apache-airflow/airflow > values.yaml 

After that, check that fixPermissions is set to true:

persistence:
  # Enable persistent volumes
  enabled: true
  # Volume size for worker StatefulSet
  size: 10Gi
  # If using a custom storageClass, pass name ref to all statefulSets here
  storageClassName:
  # Execute init container to chown log directory.
  # This is currently only needed in kind, due to usage
  # of local-path provisioner.
  fixPermissions: true

Update your installation by running:

helm upgrade --install airflow apache-airflow/airflow -n airflow -f values.yaml --debug
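To double-check that the override was actually applied, you can inspect the user-supplied values of the deployed release (assuming the same release name and namespace as above):

helm get values airflow -n airflow | grep fixPermissions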

Upvotes: 1

MrBronson

Reputation: 612

I had the same problem. For me, the cause of the task failing at the very start of the run was that my worker didn't have write permissions on the mounted logs directory (it was a read-only mount on a shared drive). Once I fixed that, everything started to work.
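A quick way to confirm this is to try creating a file in the logs directory from inside the worker container (a sketch; the airflow-worker service name and the /opt/airflow/logs path are assumptions based on the default image layout):

docker-compose exec airflow-worker touch /opt/airflow/logs/.write_test
# "Permission denied" or "Read-only file system" here means the mount is the culprit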

Upvotes: 2

Maltita

Reputation: 79

I don't have the solution for this, but I have a clue.

Apparently, the problem is a bug that prevents Airflow from storing the log if the task didn't even get to run, as you already know.

So, something that is not a syntax error is causing an error. In my case, I'm 80% sure it is about Airflow not picking up the right path to my config and utils folders. The first thing the task does is try to use functions and credentials stored in those folders; it can't, so it crashes immediately, before it is able to store any logs. I can probably do something about it in the YAML file.
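If you suspect the same thing and are on a docker-compose setup, one way to check is to see whether those folders are importable from inside the container at all (a sketch; the airflow-scheduler service name and the utils module name are placeholders for your own layout):

docker-compose exec airflow-scheduler python -c "import sys; print(sys.path)"
docker-compose exec airflow-scheduler python -c "import utils"
# the second command fails with ModuleNotFoundError if the folder is not on the path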

BTW yesterday I saw your question across multiple platforms without any answer and I want to tell you that my soul resonated with yours on this crusade to make the godforsaken Airflow DAG work. I feel you, bro.

Upvotes: 7

shock_in_sneakers

Reputation: 390

The same problem here. I'm using CeleryExecutor in a K8S cluster, with each component running as an independent pod (under a Deployment). My first thought: it could be related to a lack of mounted volumes (with the log files). I'll try to mount a PVC and report back if it works.
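Before wiring up the PVC, it can be worth checking whether the worker pod can write to the log path at all (a sketch; the airflow namespace and <worker-pod-name> are placeholders for your cluster):

kubectl -n airflow get pods
kubectl -n airflow exec <worker-pod-name> -- touch /opt/airflow/logs/.write_test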

Upvotes: 1
