Karan Khanna

Reputation: 2137

Akka: Application not scaling down when using Kubernetes HorizontalPodAutoscaler

We are working on an akka-cluster based application which we run in a Kubernetes cluster. We would now like the application to scale up when the load on the cluster increases, and we are using a HorizontalPodAutoscaler to achieve this. Our manifest files look like:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
  namespace: some-namespace
  labels:
    componentName: our-component
    app: our-component
    version: some-version
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "9252"
spec:
  serviceName: our-component
  replicas: 2
  selector:
    matchLabels:
      componentName: our-component
      app: our-app
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        componentName: our-component
        app: our-app
    spec:
      containers:
        - name: our-component-container
          image: image-path
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 0.1
              memory: 500Mi
            limits:
              cpu: 1
              memory: 1Gi
          command:
            - "/microservice/bin/our-component"
          ports:
            - name: remoting
              containerPort: 8080
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /ready
              port: 9085
            initialDelaySeconds: 40
            periodSeconds: 30
            failureThreshold: 3
            timeoutSeconds: 30
          livenessProbe:
            httpGet:
              path: /alive
              port: 9085
            initialDelaySeconds: 130
            periodSeconds: 30
            failureThreshold: 3
            timeoutSeconds: 5
   

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: some-namespace
  labels:
    componentName: our-component
    app: our-app
spec:
  minReplicas: 2
  maxReplicas: 8
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: app
  targetCPUUtilizationPercentage: 75

The issue we face is that, as soon as we deploy the application, it scales up to the maxReplicas defined, even when there is no load on the application. The application also never seems to scale down.

Can someone who faced a similar issue in their application share their experience of why this happens and if they were able to resolve this?

Upvotes: 0

Views: 160

Answers (2)

David Ogren

Reputation: 4800

I suspect this is just because you have such a low CPU request value. With a 0.1 request, any time a Pod uses more than 10% of a vCPU the HPA is going to create new pods, and startup activity alone could easily use more than 0.1 CPU. So even the startup activity is enough to force the HPA to spawn more pods. But then those new pods get added to the cluster, so there is consensus activity, which again might push all of the pods above the 0.1 request on average.

It's a little surprising to me, because you'd think an idle application would stabilize below 0.1 vCPU, but 0.1 vCPU is very tiny.
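For reference, the HPA computes CPU utilization relative to the pod's request, not its limit, and the documented scaling rule is roughly:

desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)

As a hypothetical illustration with the numbers from the question: with a 0.1 (100m) request and a 75% target, pods averaging 150m during startup report 150% utilization, so 2 replicas become ceil(2 * 150 / 75) = 4, and the next evaluation repeats the jump until maxReplicas is reached.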

I'd test:

  • First, just bump up to more reasonable CPU requests (see the sketch after this list). If you have a 1.0 CPU request and a 2.0 CPU limit, does this still happen? If not, then I was right and the request value was just set so low that "overhead"-style activity could exceed the target.
  • If you still see this behavior even then, verify your HPA settings. The defaults should be OK, but validate everything: run a describe on the HPA to see its status and events (also sketched below), and maybe play around with periodSeconds and the stabilization settings.
  • If that still doesn't give you any clues, run the standard HPA samples and make sure that they work. Maybe there is a problem with the collected metrics or something similar.
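For the first two points, a rough sketch of what I mean. The names and values below are taken from your manifests, but treat them as a starting point rather than a prescription:

resources:
  requests:
    cpu: "1"            # baseline/startup activity now has to exceed 750m before the 75% target trips
    memory: 500Mi
  limits:
    cpu: "2"
    memory: 1Gi

kubectl describe hpa app-hpa -n some-namespace   # shows current/target utilization and scaling events

And if your cluster has the autoscaling/v2 API, you can also slow scale-down explicitly:

spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes of consistently low usage before removing pods
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60             # remove at most one pod per minute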

Upvotes: 0

zer0

Reputation: 2909

The issue is with the resource requests and limits. You have set the requested CPU to "1" and the limit to "0.1". Therefore, as soon as your pod runs, the limit is naturally exceeded, the autoscaling kicks in, and it keeps scaling to the maximum number of replicas.

You need to swap the values so that your request becomes 0.1 and your limit becomes 1.0, as sketched below. This way your pod starts with 0.1 units of CPU shares, and once average usage across all pods grows beyond the configured 75% target you get more replicas; if the usage drops again, you get a scale-down, just as expected.
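In other words, something like this in the container spec (a sketch, mirroring the values already discussed):

resources:
  requests:
    cpu: "0.1"      # what the pod is scheduled with; HPA utilization is measured against this value
    memory: 500Mi
  limits:
    cpu: "1"        # what the pod may burst up to
    memory: 1Gi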

As a general rule of thumb, the request should be less than the limit, or at most equal to it. It can never be higher, because then you end up with an infinitely scaling infrastructure.

Upvotes: 1
