Reputation: 39005
I am using Jenkins on Kubernetes to build my project, but sometimes when Jenkins starts an agent pod, it shows "launching..." and then "suspended". When I check the agent's log output, it returns a 404:
HTTP ERROR 404 Not Found
URI: /computer/default-j07v7/log
STATUS: 404
MESSAGE: Not Found
SERVLET: Stapler
Powered by Jetty:// 9.4.27.v20200227
The pod gets suspended and is then relaunched, again and again. The pod creation events look normal:
Normal Scheduled default-scheduler Successfully assigned infrastructure/default-v7m44 to k8sslave3
Normal Pulled 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Container image "jenkins/jnlp-slave:3.27-1" already present on machine
Normal Created 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Created container jnlp
Normal Started 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Started container jnlp
What should I do to fix this problem? I have been trying for days, and I found that if I tweak any parameter of the pod template, the agent changes to suspended immediately. If I keep the defaults, the agent starts up normally. It is a weird problem and it confuses me. This is my Jenkins master deployment YAML:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: jenkins
  namespace: infrastructure
  selfLink: /apis/apps/v1/namespaces/infrastructure/deployments/jenkins
  uid: 3df24fd6-ffaf-4f17-8b02-a2904cabbf95
  resourceVersion: '1707498'
  generation: 38
  creationTimestamp: '2020-07-18T14:48:47Z'
  labels:
    app.kubernetes.io/component: jenkins-master
    app.kubernetes.io/instance: jenkins
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jenkins
    helm.sh/chart: jenkins-2.4.1
  annotations:
    deployment.kubernetes.io/revision: '10'
    meta.helm.sh/release-name: jenkins
    meta.helm.sh/release-namespace: infrastructure
  managedFields:
    - manager: Go-http-client
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-02T10:08:04Z'
      fieldsType: FieldsV1
    - manager: dashboard
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-17T14:27:59Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:template':
            'f:spec':
              'f:containers':
                'k:{"name":"jenkins"}':
                  'f:volumeMounts':
                    'k:{"mountPath":"/usr/bin/docker"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                    'k:{"mountPath":"/var/run/docker.sock"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
              'f:securityContext':
                'f:runAsUser': {}
              'f:volumes':
                'k:{"name":"docker"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
                'k:{"name":"dockersock"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-18T16:14:00Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:deployment.kubernetes.io/revision': {}
        'f:status':
          'f:availableReplicas': {}
          'f:conditions':
            .: {}
            'k:{"type":"Available"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Progressing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:observedGeneration': {}
          'f:readyReplicas': {}
          'f:replicas': {}
          'f:updatedReplicas': {}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: jenkins-master
      app.kubernetes.io/instance: jenkins
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: jenkins-master
        app.kubernetes.io/instance: jenkins
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: jenkins
        helm.sh/chart: jenkins-2.4.1
      annotations:
        checksum/config: 60990c68bb90ec59c79d56498da29d250d8da13cfbb9c35cad53f0cd789f318b
    spec:
      volumes:
        - name: plugins
          emptyDir: {}
        - name: tmp
          emptyDir: {}
        - name: jenkins-config
          configMap:
            name: jenkins
            defaultMode: 420
        - name: secrets-dir
          emptyDir: {}
        - name: plugin-dir
          emptyDir: {}
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins
        - name: sc-config-volume
          emptyDir: {}
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
            type: ''
        - name: docker
          hostPath:
            path: /usr/bin/docker
            type: ''
      initContainers:
        - name: copy-default-config
          image: 'jenkins/jenkins:lts'
          command:
            - sh
            - /var/jenkins_config/apply_config.sh
          env:
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugins
              mountPath: /usr/share/jenkins/ref/plugins
            - name: plugin-dir
              mountPath: /var/jenkins_plugins
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
      containers:
        - name: jenkins
          image: 'jenkins/jenkins:lts'
          args:
            - '--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)'
            - '--argumentsRealm.roles.$(ADMIN_USER)=admin'
            - '--httpPort=8080'
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: slavelistener
              containerPort: 50000
              protocol: TCP
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: JAVA_OPTS
              value: |
                -Dcasc.reload.token=$(POD_NAME)
            - name: JENKINS_OPTS
            - name: JENKINS_SLAVE_AGENT_PORT
              value: '50000'
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
            - name: CASC_JENKINS_CONFIG
              value: /var/jenkins_home/casc_configs
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              readOnly: true
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugin-dir
              mountPath: /usr/share/jenkins/ref/plugins/
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: dockersock
              mountPath: /var/run/docker.sock
            - name: docker
              mountPath: /usr/bin/docker
          livenessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 90
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 60
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
        - name: jenkins-sc-config
          image: 'kiwigrid/k8s-sidecar:0.1.144'
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: LABEL
              value: jenkins-jenkins-config
            - name: FOLDER
              value: /var/jenkins_home/casc_configs
            - name: NAMESPACE
              value: infrastructure
            - name: REQ_URL
              value: >-
                http://localhost:8080/reload-configuration-as-code/?casc-reload-token=$(POD_NAME)
            - name: REQ_METHOD
              value: POST
            - name: REQ_RETRY_CONNECT
              value: '10'
          resources: {}
          volumeMounts:
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: jenkins-home
              mountPath: /var/jenkins_home
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: jenkins
      serviceAccount: jenkins
      securityContext:
        runAsUser: 0
        fsGroup: 976
      schedulerName: default-scheduler
  strategy:
    type: Recreate
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 38
  replicas: 1
  updatedReplicas: 1
  readyReplicas: 1
  availableReplicas: 1
  conditions:
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2020-08-17T14:45:20Z'
      lastTransitionTime: '2020-08-17T14:45:20Z'
      reason: NewReplicaSetAvailable
      message: ReplicaSet "jenkins-7454db64f6" has successfully progressed.
    - type: Available
      status: 'True'
      lastUpdateTime: '2020-08-18T16:14:00Z'
      lastTransitionTime: '2020-08-18T16:14:00Z'
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.
This is part of the log output from the master pod:
2020-08-21 16:44:40.381+0000 [id=955] WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@2fb3e877 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
at okhttp3.RealCall.enqueue(RealCall.java:127)
at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:45.239+0000 [id=33] INFO hudson.slaves.NodeProvisioner#lambda$update$6: default-3393d provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:44:45.241+0000 [id=2765] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-3393d
2020-08-21 16:44:45.302+0000 [id=2826] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:45.350+0000 [id=2765] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:55.363+0000 [id=2765] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-3393d, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-3393d (jnlp)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:55.363+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-3393d
Terminated Kubernetes instance for agent infrastructure/default-3393d
Disconnected computer default-3393d
2020-08-21 16:44:55.383+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent infrastructure/default-3393d
2020-08-21 16:44:55.383+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer default-3393d
2020-08-21 16:45:05.198+0000 [id=42] INFO o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 1
2020-08-21 16:45:05.198+0000 [id=42] INFO o.c.j.p.k.KubernetesCloud#provision: Template for label null: default
2020-08-21 16:45:12.383+0000 [id=955] WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@6c6c7a45 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
at okhttp3.RealCall.enqueue(RealCall.java:127)
at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:15.236+0000 [id=2765] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-03q6x
2020-08-21 16:45:15.252+0000 [id=36] INFO hudson.slaves.NodeProvisioner#lambda$update$6: default-03q6x provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:45:15.314+0000 [id=2824] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:15.381+0000 [id=2765] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:25.390+0000 [id=2765] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-03q6x, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-03q6x (jnlp)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:25.391+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-03q6x
Terminated Kubernetes instance for agent infrastructure/default-03q6x
This is a snapshot of my Kubernetes cloud configuration:
This is the pod template config:
Upvotes: 4
Views: 6286
Reputation: 9212
I would suggest a few changes:
Leave the Jenkins tunnel field blank. Jenkins will pick it up automatically.
If you deployed this Jenkins instance in a Kubernetes cluster, use the internal address for the Jenkins URL, like http://jenkins.infrastructure.svc
I assume your Jenkins service name is jenkins and that it is of type ClusterIP.
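To illustrate what "internal address" means here, the following is a minimal Python sketch that builds the cluster-internal URL of a service from its name and namespace. The function name in_cluster_url is hypothetical. The default port 8080 and the cluster.local domain are assumptions taken from the JENKINS_URL value that appears in the question's log, and they may differ in other clusters.

```python
def in_cluster_url(service, namespace, port=8080, cluster_domain="cluster.local"):
    """Build the cluster-internal HTTP URL of a Kubernetes Service.

    Assumes the standard <service>.<namespace>.svc.<cluster-domain> DNS name;
    the port and cluster domain defaults are assumptions, not universal values.
    """
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{port}"


# Service "jenkins" in namespace "infrastructure", as in the question.
print(in_cluster_url("jenkins", "infrastructure"))
# prints http://jenkins.infrastructure.svc.cluster.local:8080
```

The shorter form http://jenkins.infrastructure.svc normally resolves as well from inside the cluster, because pods get the cluster domain as a DNS search suffix.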
Upvotes: 2