Gowmi
Gowmi

Reputation: 611

Kubernetes metrics server API

I am running a Kuberentes cluster in dev environment. I executed deployment files for metrics server, my pod is up and running without any error message. See the output here:

root@master:~/pre-release# kubectl get pod -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
metrics-server-568697d856-9jshp   1/1     Running   0          10m   10.244.1.5   worker-1   <none>           <none>

Next when I am checking API service status, it shows up as below


Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2021-03-29T17:32:16Z
  Resource Version:    39213
  UID:                 201f685d-9ef5-4f0a-9749-8004d4d529f4
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       pre-release
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2021-03-29T17:32:16Z
    Message:               failing or missing response from https://10.105.171.253:443/apis/metrics.k8s.io/v1beta1: Get "https://10.105.171.253:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

Here the metric server deployment code

containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        - --kubelet-use-node-status-port
        image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2

Here the complete code

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: pre-release
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls
        - --kubelet-use-node-status-port
        image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
           httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
       - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: pre-release
  version: v1beta1
  versionPriority: 100

latest error

I0330 09:02:31.705767       1 secure_serving.go:116] Serving securely on [::]:4443

E0330 09:04:01.718135       1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:worker-2: unable to fetch metrics from Kubelet worker-2 (worker-2): Get https://worker-2:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup worker-2 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:master: unable to fetch metrics from Kubelet master (master): Get https://master:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup master on 10.96.0.10:53: read udp 10.244.2.23:41419->10.96.0.10:53: i/o timeout, unable to fully scrape metrics from source kubelet_summary:worker-1: unable to fetch metrics from Kubelet worker-1 (worker-1): Get https://worker-1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: i/o timeout]

Could someone please help me to fix the issue.

Upvotes: 1

Views: 2884

Answers (2)

Jakub
Jakub

Reputation: 8840

As I mentioned in the comment section, this may be fixed by adding hostNetwork:true to the metrics-server Deployment.

According to kubernetes documentation:

HostNetwork - Controls whether the pod may use the node network namespace. Doing so gives the pod access to the loopback device, services listening on localhost, and could be used to snoop on network activity of other pods on the same node.

spec:
  hostNetwork: true   <--- 
  containers:
  - args:
    - /metrics-server
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP
    - --kubelet-use-node-status-port
    - --kubelet-insecure-tls

There is an example with information how can you edit your deployment to include that hostNetwork:true in your metrics-server deployment.

Also related github issue.

Upvotes: 1

rock&#39;n rolla
rock&#39;n rolla

Reputation: 2231

Following container arguments work for me in our development cluster

containers:
- args:
  - /metrics-server
  - --cert-dir=/tmp
  - --secure-port=443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls

Result of kubectl describe apiservice v1beta1.metrics.k8s.io:

Status:
  Conditions:
    Last Transition Time:  2021-03-29T19:19:20Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available

Give it a try.

Upvotes: 1

Related Questions