Yuzobra

Reputation: 61

Airflow on Kubernetes cannot fetch logs

My Airflow service runs as a Kubernetes deployment with two containers, one for the webserver and one for the scheduler. I'm running a task using a KubernetesPodOperator with in_cluster=True, and it runs fine; I can even kubectl logs pod-name and all the logs show up.

However, the airflow-webserver is unable to fetch the logs:

*** Log file does not exist: /tmp/logs/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Fetching from: http://pod-name-7dffbdf877-6mhrn:8793/log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='pod-name-7dffbdf877-6mhrn', port=8793): Max retries exceeded with url: /log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef6e00df10>: Failed to establish a new connection: [Errno 111] Connection refused'))

It seems the webserver is unable to connect to the Airflow log-serving service on port 8793. If I kubectl exec bash into the container, I can curl localhost on port 8080, but not on ports 80 or 8793.

Kubernetes deployment:

# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-name
  namespace: airflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-name
  template:
    metadata:
      labels:
        app: pod-name
    spec:
      restartPolicy: Always
      volumes:
        - name: airflow-cfg
          configMap:
            name: airflow.cfg
        - name: dags
          emptyDir: {}
      containers:
      - name: airflow-scheduler
        args:
        - airflow
        - scheduler
        image: registry.personal.io:5000/image/path
        imagePullPolicy: Always
        volumeMounts:
        - name: dags
          mountPath: /airflow_dags
        - name: airflow-cfg
          mountPath: /home/airflow/airflow.cfg
          subPath: airflow.cfg
        env:
        - name: EXECUTOR
          value: Local
        - name: LOAD_EX
          value: "n"
        - name: FORWARDED_ALLOW_IPS
          value: "*"
        ports:
          - containerPort: 8793
          - containerPort: 8080
      - name: airflow-webserver
        args:
        - airflow
        - webserver
        - --pid
        - /tmp/airflow-webserver.pid
        image: registry.personal.io:5000/image/path
        imagePullPolicy: Always
        volumeMounts:
        - name: dags
          mountPath: /airflow_dags
        - name: airflow-cfg
          mountPath: /home/airflow/airflow.cfg
          subPath: airflow.cfg
        ports:
        - containerPort: 8793
        - containerPort: 8080
        env:
        - name: EXECUTOR
          value: Local
        - name: LOAD_EX
          value: "n"
        - name: FORWARDED_ALLOW_IPS
          value: "*"

Note: if Airflow is run in a dev environment (locally instead of on Kubernetes), everything works perfectly.

Upvotes: 4

Views: 15774

Answers (4)

Programmer007

Reputation: 87

Creating a PersistentVolume and storing the logs on it might help.

---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: testlog-volume
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 2Gi
  hostPath:
    path: /opt/airflow/logs/
  storageClassName: standard
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: testlog-volume
spec:
  accessModes:
    - ReadWriteMany  # matches the PV above so the claim can bind and be shared between pods
  resources:
    requests:
      storage: 2Gi
  storageClassName: standard

If you are using the Helm chart to deploy Airflow, you can use

 --set executor=KubernetesExecutor --set logs.persistence.enabled=true --set logs.persistence.existingClaim=testlog-volume
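
If you are not using the Helm chart, the rough equivalent is to mount that claim at Airflow's log directory in both containers of the deployment from the question. A minimal sketch; the mountPath here is an assumption, so adjust it to whatever base_log_folder is set to in your airflow.cfg (the error above shows /tmp/logs):

# Sketch: pod spec fragment mounting the log claim in both Airflow containers
    spec:
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: testlog-volume
      containers:
      - name: airflow-scheduler
        volumeMounts:
        - name: logs
          mountPath: /opt/airflow/logs   # or /tmp/logs, to match base_log_folder
      - name: airflow-webserver
        volumeMounts:
        - name: logs
          mountPath: /opt/airflow/logs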

Upvotes: 3

Yuzobra

Reputation: 61

The problem was a bug in how the KubernetesPodOperator in Airflow v1.10.10 launched its pods. Upgrading to Airflow 2.0 solved the issue.

Upvotes: 2

Zeev

Reputation: 11

I have outlined our solution for Kubernetes executor logs here (no need for log volumes, Elasticsearch, or any other complex solutions): https://szeevs.medium.com/handling-airflow-logs-with-kubernetes-executor-25c11ea831e4 Our use case may be a little different, but you can adapt it to what you need.

Upvotes: 1

LiorH

Reputation: 18824

Airflow deletes the pods after task completion. Could it be that the pods are already gone by the time the webserver tries to access their logs?

To see if that's the case, try setting AIRFLOW__KUBERNETES__DELETE_WORKER_PODS=False.
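
In a deployment like the one in the question, that is just another env entry on the scheduler container, e.g.:

        env:
        - name: AIRFLOW__KUBERNETES__DELETE_WORKER_PODS
          value: "False"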

When running Airflow on Kubernetes I suggest using remote logging (e.g. S3); that way the logs are kept even after the pods are deleted.
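
A minimal sketch of what that could look like as extra environment variables on both Airflow containers, assuming Airflow 2.x (on 1.10.x these settings live under AIRFLOW__CORE__ instead of AIRFLOW__LOGGING__); the bucket name and connection id below are placeholders:

        env:
        - name: AIRFLOW__LOGGING__REMOTE_LOGGING
          value: "True"
        - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
          value: "s3://my-airflow-logs"       # placeholder bucket
        - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
          value: "aws_default"                # an existing Airflow connection with S3 access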

Upvotes: 1
