Marco_81

Reputation: 163

OpenShift 4.4 - cannot 'oc logs' or 'oc exec' on pods running on worker nodes

OpenShift 4.4.17 cluster (3 masters and 3 workers).

I get an error when trying to view the logs of (or exec into a terminal on) pods running on worker nodes. The same happens in the OpenShift web console. There are no issues doing the same for pods running on master nodes.

Example 1: pod running on a worker node

$ oc whoami
kube:admin
$ oc get pod -n lamp
NAME                         READY   STATUS    RESTARTS   AGE
lamp-lamp-6c7d9f467d-jsn4t   3/3     Running   0          108d

$ oc logs lamp-lamp-6c7d9f467d-jsn4t httpd -n lamp
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log lamp-lamp-6c7d9f467d-jsn4t))

Example 2: pods running on master nodes

$ oc get pod -n openshift-apiserver
NAME                       READY   STATUS    RESTARTS   AGE
apiserver-6d64545f-5lmb8   1/1     Running   0          2d19h
apiserver-6d64545f-hktqd   1/1     Running   0          2d19h
apiserver-6d64545f-kb4qt   1/1     Running   0          2d19h

$ oc logs apiserver-6d64545f-5lmb8 -n openshift-apiserver
Copying system trust bundle
I0225 20:41:39.989689       1 requestheader_controller.go:243] (..output omitted..)

Investigating kubelet on worker nodes:

The kubelet service is running on every worker node, but

journalctl -u kubelet 

shows these two lines:

Unable to authenticate the request due to an error: x509: certificate signed by unknown authority
logging error output: "Unauthorized"

About kubeconfig on worker nodes:

Inspecting the contents of the /etc/kubernetes/kubeconfig file:

- kubelet connects to the API server               --> https://api-int.ocs-cls1.mycompany.lab
- the server presents a valid certificate signed by --> kube-apiserver-lb-signer
- certificate-authority-data carries                --> kube-apiserver-lb-signer rootCA

The kubeconfig looks correct.
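One way to double-check this by hand is to decode the CA bundle embedded in the kubeconfig and ask openssl whether it validates the certificate the API server presents. The following is only a sketch: the helper name `extract_ca`, the output path under /tmp, and port 6443 are assumptions for illustration, not something the cluster defines.

```shell
# Decode the first certificate-authority-data entry of a kubeconfig to PEM.
# "extract_ca" is a hypothetical helper name.
extract_ca() {
  local kubeconfig="$1"
  # The field looks like "    certificate-authority-data: <base64>"
  awk '/certificate-authority-data:/ {print $2; exit}' "$kubeconfig" | base64 -d
}

# Possible usage on a worker node (host name from the question, port assumed):
#   extract_ca /etc/kubernetes/kubeconfig > /tmp/kubeconfig-ca.crt
#   openssl s_client -connect api-int.ocs-cls1.mycompany.lab:6443 \
#     -CAfile /tmp/kubeconfig-ca.crt </dev/null 2>/dev/null | grep 'Verify return code'
```

A result of "Verify return code: 0 (ok)" would suggest the kubelet-to-API-server direction is fine, which would point the investigation at the opposite direction (the API server authenticating to the kubelet) instead.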

UPDATE:

I also noticed these log lines reporting a bad certificate:

$ oc -n openshift-apiserver logs apiserver-6d64545f-5lmb8
log.go:172] http: TLS handshake error from 10.128.0.12:47078: remote error: tls: bad certificate
...
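To see which clients are failing the handshake, the log lines can be grouped by source IP. This is a rough sketch; the function name `tls_error_sources` is made up for illustration, and it assumes the log format shown above.

```shell
# Read apiserver log lines on stdin and print "<count> <source IP>",
# most frequent first. "tls_error_sources" is a hypothetical helper name.
tls_error_sources() {
  grep -o 'TLS handshake error from [0-9.]*' \
    | awk '{print $NF}' \
    | sort | uniq -c | sort -rn \
    | awk '{print $1, $2}'
}

# Possible usage:
#   oc -n openshift-apiserver logs apiserver-6d64545f-5lmb8 | tls_error_sources
#   oc get pods -A -o wide | grep 10.128.0.12   # map an IP back to a pod
```

Mapping the IPs back to pods may help tell whether the bad certificate comes from a single component or from many clients at once.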

UPDATE2:

I also checked the apiserver-loopback-client certificate:

$ curl --resolve apiserver-loopback-client:6443:{IP_MASTER} -v -k https://apiserver-loopback-client:6443/healthz
*        server certificate verification SKIPPED
*        server certificate status verification SKIPPED
*        common name: apiserver-loopback-client@1614330374 (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: CN=apiserver-loopback-client@1614330374
*        start date: Fri, 26 Feb 2021 08:06:13 GMT
*        expire date: Sat, 26 Feb 2022 08:06:13 GMT
*        issuer: CN=apiserver-loopback-client-ca@1614330374

Upvotes: 2

Views: 2576

Answers (1)

张馆长

Reputation: 1869

Try this. First, run a loop that keeps approving certificate signing requests as they appear:

while :;do
  sleep 2
  oc get csr -o name | xargs -r oc adm certificate approve
done

Then, in another terminal, SSH to any master node and run this to stop the running cert-syncer containers:

crictl ps -a | awk '/Running/&&/-cert-syncer/{print $1}' | xargs -r crictl stop
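The approval loop above passes every CSR name to the approve command, including ones that are already approved. If that is too noisy, a variant that only picks Pending requests could look like the sketch below. The function name `pending_csrs` is hypothetical, and it assumes the condition is the last column of the `oc get csr --no-headers` output.

```shell
# Read `oc get csr --no-headers` output on stdin and print the names of
# CSRs whose last column (CONDITION) is Pending.
# "pending_csrs" is a hypothetical helper name.
pending_csrs() {
  awk '$NF ~ /Pending/ {print $1}'
}

# Possible usage inside the loop:
#   oc get csr --no-headers | pending_csrs | xargs -r oc adm certificate approve
```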

Upvotes: 1
