Google cloud GKE horizontal pod autoscaling based on Kubernetes metrics

Question

I want to use the pod network received bytes count standard kubernetes metrics on HPA . Using following yaml to accomplish that but getting errors like unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count
      targetAverageValue: 20k

If anybody had an experience with same kind of metrics usage that would be greatly helpful

PjoterS · Accepted Answer

Solution

To make it work, you need to deploy Stackdriver Custom Metrics Adapter. Below commands to deploy it.

$ kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

$ kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

Later you need to use proper Custom Metric, in your case it should be kubernetes.io|pod|network|received_bytes_count

Description

In Custom and external metrics for autoscaling workloads documentation you have information that you need to deploy StackDriver Adapter before you will be able to get custom metrics.

Before you can use custom metrics, you must enable Monitoring in your Google Cloud project and install the Stackdriver adapter on your cluster.

Next step is to deploy your application (I've used Nginx deployment for test purpose) and create proper HPA.

In your HPA example, you have a few issues

apiVersion: autoscaling/v2beta1 ## you can also use autoscaling/v2beta2 if you need more features, however for this scenario is ok
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx # HPA have namespace specified, deployment doesnt have
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1 # apiVersion: apps/v1beta1 is quite old. In Kubernetes 1.16+ it was changed to apps/v1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count # this metrics should be replaced with kubernetes.io|pod|network|received_bytes_count
      targetAverageValue: 20k

In GKE you can choose between autoscaling/v2beta1 and autoscaling/v2beta2. Your case will work with both apiVersions, however if you would decide to use autoscaling/v2beta2 you will need to change manifest syntax.

Why kubernetes.io/pod/network/received_bytes_count ? You are refering to Kubernetes metrics, and /pod/network/received_bytes_count is provided in this docs.

Why | instead of /? If you will check Stackdriver documentation on Github you will find information.

Stackdriver metrics have a form of paths separated by "/" character, but Custom Metrics API forbids using "/" character. When using Custom Metrics - Stackdriver Adapter either directly via Custom Metrics API or by specifying a custom metric in HPA, replace "/" character with "|". For example, to use custom.googleapis.com/my/custom/metric, specify custom.googleapis.com|my|custom|metric.

Proper Configuration

for v2beta1

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
spec:
  scaleTargetRef:
    apiVersion: apps/v1 # In your case should be apps/v1beta1 but my deployment was created with apps/v1 apiVersion
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: "kubernetes.io|pod|network|received_bytes_count"
      targetAverageValue: 20k

for v2beta2

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: "kubernetes.io|pod|network|received_bytes_count"
      target:
        type: AverageValue
        averageValue: 20k

Tests Outputs

Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric kubernetes.io|pod|network|received_bytes_count
  ScalingLimited  True    TooFewReplicas    the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age                 From                       Message
  ----    ------             ----                ----                       -------
  Normal  SuccessfulRescale  8m18s               horizontal-pod-autoscaler  New size: 4; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  8m9s                horizontal-pod-autoscaler  New size: 6; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  17s                 horizontal-pod-autoscaler  New size: 5; reason: All metrics below target
  Normal  SuccessfulRescale  9s (x2 over 8m55s)  horizontal-pod-autoscaler  New size: 2; reason: All metrics below target

Possible Issues with your current configuration

In your HPA you have specified namespace but not in your target Deployment. Both, HPA and deployment should have the same namespace. With this mismatch configuration you can encounter issue below:

Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: deployments/scale.apps "nginx" not found
Events:
  Type     Reason          Age                  From                       Message
  ----     ------          ----                 ----                       -------
  Warning  FailedGetScale  94s (x264 over 76m)  horizontal-pod-autoscaler  deployments/scale.apps "nginx" not found

In Kubernetes 1.16+, deployments are using apiVersion: apps/v1, you won't be able to create deployment with apiVersion: apps/v1beta1 in Kubernets 1.16+

Google cloud GKE horizontal pod autoscaling based on Kubernetes metrics

Answers (2)

Related Questions