Reputation: 77
How can I delete failed jobs in a Kubernetes cluster using a CronJob in GKE? When I tried to delete the failed jobs with the following YAML, it deleted all the jobs (including running ones):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: XXX
  namespace: XXX
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: XXX
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:latest
            command: ["sh", "-c", "kubectl delete jobs $(kubectl get jobs | awk '$2 ~ 1/1' | awk '{print $1}')"]
          restartPolicy: OnFailure
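Presumably the awk filter is the culprit here: because 1/1 is unquoted, awk evaluates it as the arithmetic expression 1/1 (i.e. 1), so $2 ~ 1/1 matches any COMPLETIONS value that contains the digit 1, including jobs that are still running or have failed, such as 0/1. A minimal sketch with hypothetical job names:

$ kubectl get jobs
NAME      COMPLETIONS   DURATION   AGE
job-ok    1/1           10s        5m
job-bad   0/1           2m13s      2m13s
$ kubectl get jobs | awk '$2 ~ 1/1' | awk '{print $1}'
job-ok
job-bad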
Upvotes: 1
Views: 12301
Reputation: 9022
This one looks visually cleaner to me:
kubectl delete job --field-selector=status.phase==Failed
Upvotes: 3
Reputation: 8122
@Dawid Kruk's answer is excellent, but it works on a specific namespace only and not across all namespaces, which is what I needed. To solve that, I've created a simple bash script that gets all failed jobs and deletes them:
# Delete failed jobs across all namespaces
failedJobs=$(kubectl get job -A -o=jsonpath='{range .items[?(@.status.failed>=1)]}{.metadata.name}{"\t"}{.metadata.namespace}{"\n"}{end}')
echo "$failedJobs" | while read each
do
  # each line is "<job name><TAB><namespace>"
  [ -z "$each" ] && continue
  array=($each)
  jobName=${array[0]}
  namespace=${array[1]}
  echo "Debug: deleting job $jobName in namespace $namespace"
  kubectl delete job "$jobName" -n "$namespace"
done
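For reference, the jsonpath prints one failed Job per line as <name><TAB><namespace>, which is exactly what the loop splits into the two variables. A sketch of what the intermediate output could look like (the job and namespace names here are made up):

$ kubectl get job -A -o=jsonpath='{range .items[?(@.status.failed>=1)]}{.metadata.name}{"\t"}{.metadata.namespace}{"\n"}{end}'
cleanup-28123456	default
importer-28123450	data-pipeline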
Upvotes: 1
Reputation: 9877
To delete failed Jobs in GKE you will need to use the following command:

$ kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}')

This command will output the JSON for all jobs and search for jobs that have the status.failed field set to 1. It will then pass the failed jobs to $ kubectl delete job.

This command, run in a CronJob, will fail when there are no jobs with status: failed.
As a workaround you can use:

command: ["sh", "-c", "kubectl delete job --ignore-not-found=true $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'); exit 0"]

exit 0 was added to make sure that the Pod exits with status code 0.
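To sanity-check the selector before wiring it into the CronJob, you can run the jsonpath part on its own; it simply prints the names of the Jobs that would be deleted (a sketch, using the failed example Jobs shown further below):

$ kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'
job-four job-three job-two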
As for part of the comments made under the question:

You will need to modify it to support "Failed" Jobs

I have already tried the following, but it's not deleting the jobs.
kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.Failed==1)].metadata.name}')

@.status.Failed==1 <-- incorrect, as JSON is case sensitive
@.status.failed==1 <-- correct

If you were to run the incorrect version of this command on the following Pods (shown to make clear that they failed and aren't still running to completion):
NAME              READY   STATUS      RESTARTS   AGE
job-four-9w5h9    0/1     Error       0          5s
job-one-n9trm     0/1     Completed   0          6s
job-three-nhqb6   0/1     Error       0          5s
job-two-hkz8r     0/1     Error       0          6s
You should get the following error:

error: resource(s) were provided, but no name, label selector, or --all flag specified

The above error will also show when no jobs were passed to $ kubectl delete job.
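In other words, the incorrect selector matches nothing, the command substitution expands to an empty string, and kubectl delete job ends up being called without any names (a sketch of the effective invocation):

$ kubectl delete job
error: resource(s) were provided, but no name, label selector, or --all flag specified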
Running the correct version of this command should delete all the jobs that failed:
job.batch "job-four" deleted
job.batch "job-three" deleted
job.batch "job-two" deleted
Upvotes: 2