Kunfu Panda

Reputation: 77

Deleting failed jobs in Kubernetes (GKE)

How do I delete failed jobs in a Kubernetes cluster on GKE using a CronJob? When I tried to delete the failed jobs using the following YAML, it deleted all the jobs (including running ones):


apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: XXX
  namespace: XXX
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: XXX
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:latest
            command: ["sh", "-c", "kubectl delete jobs $(kubectl get jobs | awk '$2 ~ 1/1' | awk '{print $1}')"]
          restartPolicy: OnFailure

Upvotes: 1

Views: 12301

Answers (3)

Vasilii Angapov

Reputation: 9022

This one looks cleaner to me:

kubectl delete job --field-selector=status.phase==Failed

Upvotes: 3

Amit Baranes

Reputation: 8122

@Dawid Kruk's answer is excellent, but it works on a specific namespace only and not across all namespaces as I needed. To solve this, I've created a simple bash script that gets all failed jobs and deletes them:

# Delete failed jobs across all namespaces
failedJobs=$(kubectl get job -A -o=jsonpath='{range .items[?(@.status.failed>=1)]}{.metadata.name}{"\t"}{.metadata.namespace}{"\n"}{end}')
echo "$failedJobs" | while read each
do
  # Each line is "<job name> <namespace>"
  array=($each)
  jobName=${array[0]}
  namespace=${array[1]}
  echo "Debug: deleting job $jobName in namespace $namespace"
  kubectl delete job "$jobName" -n "$namespace"
done
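
If you want this to run from the CronJob in the question, one option is to inline the same logic into the kubectl-runner container. The sketch below is an adaptation, not the script above verbatim: it avoids bash-only arrays so it works under plain sh, and it assumes the service account has cluster-wide permission to list and delete Jobs (needed for the -A flag):

command:
  - sh
  - -c
  - |
    # List failed Jobs in every namespace as "<name> <namespace>" pairs,
    # then delete each one in its own namespace.
    kubectl get job -A -o=jsonpath='{range .items[?(@.status.failed>=1)]}{.metadata.name}{"\t"}{.metadata.namespace}{"\n"}{end}' \
      | while read jobName namespace; do
          echo "Deleting job $jobName in namespace $namespace"
          kubectl delete job "$jobName" -n "$namespace"
        done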

Upvotes: 1

Dawid Kruk

Reputation: 9877

To delete failed Jobs in GKE you will need to use the following command:

  • $ kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}')

This command will output the JSON for all jobs, look for jobs that have the status.failed field set to 1, and then pass the names of those failed jobs to $ kubectl delete job.


When run in a CronJob, this command will fail if there are no jobs with status: failed.

As a workaround you can use:

command: ["sh", "-c", "kubectl delete job --ignore-not-found=true $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'); exit 0"]

exit 0 was added to make sure that the Pod will exit with status code 0.
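
For completeness, here is a sketch of how the CronJob from the question could look with the buggy awk pipeline replaced by this workaround command (the names, namespace and service account are the placeholders from the question):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: XXX
  namespace: XXX
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: XXX
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:latest
            # Only Jobs whose .status.failed field is set are passed to kubectl delete;
            # --ignore-not-found and exit 0 keep the Pod from failing when there is nothing to delete.
            command: ["sh", "-c", "kubectl delete job --ignore-not-found=true $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'); exit 0"]
          restartPolicy: OnFailure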


As for part of the comments made under the question:

You will need to modify it to support "Failed" Jobs

I have already tried the following, but it's not deleting the jobs: kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.Failed==1)].metadata.name}')

  • @.status.Failed==1 <-- incorrect, as JSONPath field names are case sensitive
  • @.status.failed==1 <-- correct
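
One quick way to confirm the exact (case-sensitive) field names is to dump a Job's status object and look for the lowercase failed field (the job name here is just a placeholder):

# Print the raw .status of a Job; for a failed Job it should contain a "failed" count.
kubectl get job <job-name> -o jsonpath='{.status}'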

If you were to run the incorrect version of this command against the following Pods (listed to show that they have failed and aren't still running to completion):

NAME              READY   STATUS      RESTARTS   AGE
job-four-9w5h9    0/1     Error       0          5s
job-one-n9trm     0/1     Completed   0          6s
job-three-nhqb6   0/1     Error       0          5s
job-two-hkz8r     0/1     Error       0          6s

You should get the following error:

error: resource(s) were provided, but no name, label selector, or --all flag specified

The above error will also appear when no jobs are passed to $ kubectl delete job.

Running the correct version of this command should delete all jobs that failed:

job.batch "job-four" deleted
job.batch "job-three" deleted
job.batch "job-two" deleted


Upvotes: 2
