Reputation: 7502
I have a CronJob in Kubernetes that runs every 3 minutes. It seems to be running the job fine, as shown below; however, the generated pod immediately deletes itself and I cannot look at any details as to why it gets deleted.
The CronJob skeleton is below:
apiVersion: batch/v1beta1
kind: CronJob
...
spec:
  schedule: "*/3 * * * *"
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 120
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          ...
This generates the CronJob as below:
NAME   SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
test   */3 * * * *   False     0        1m              51m
The jobs generated by this:
NAME              DESIRED   SUCCESSFUL   AGE
test-1552177080   1         0            8m
test-1552177260   1         0            5m
test-1552177440   1         0            2m
Looking at the details of one of these jobs, I can see:
Name:       test-1552177440
Namespace:  storage
...
Events:
  Type     Reason                Age                    From            Message
  ----     ------                ----                   ----            -------
  Normal   SuccessfulCreate      2m57s                  job-controller  Created pod: test-1552177440-b5d6g
  Normal   SuccessfulDelete      2m40s                  job-controller  Deleted pod: test-1552177440-b5d6g
  Warning  BackoffLimitExceeded  2m40s (x2 over 2m40s)  job-controller  Job has reached the specified backoff limit
As you can see, the pod is deleted immediately with SuccessfulDelete.
Is there any way to stop this from happening? Ultimately, I'd like to look at any logs or details as to why the pod doesn't start.
Upvotes: 1
Views: 2371
Reputation: 68
I have had the same problem.
ref: https://github.com/kubernetes/kubernetes/issues/78644#issuecomment-498165434
Once a job has failed (this occurs when it has exceeded its active deadline seconds or backoff limit), any active pods are deleted to prevent them from running/crashlooping forever. Any pods that aren't active, e.g. those in a pod phase of Failed or Succeeded, should be left around.
If you want your pods to be left around after failure, changing the restart policy of your pods to Never should prevent them from being immediately cleaned up. However, this does mean that a new pod will be created each time your pods fail, until the backoff limit is reached.
Can you try setting restartPolicy to Never?
apiVersion: batch/v1beta1
kind: CronJob
...
spec:
  schedule: "*/3 * * * *"
  ...
  jobTemplate:
    spec:
      ...
      template:
        spec:
          ...
          restartPolicy: Never # Point!
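Once the failed pods are kept around, you should be able to inspect them. A quick sketch, reusing the storage namespace and the job/pod names from your describe output (your actual pod names will differ):
# pods created by that Job (the job controller labels them with job-name)
kubectl get pods -n storage --selector=job-name=test-1552177440
# events explaining why the pod didn't start (image pull, scheduling, etc.)
kubectl describe pod test-1552177440-b5d6g -n storage
# container logs, if the container got far enough to produce any
kubectl logs test-1552177440-b5d6g -n storage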
Upvotes: 3