Reputation: 2137
We are working on an akka-cluster based application that we run in a Kubernetes cluster. We would now like the application to scale up when there is an increase in load on the cluster, and we are using a HorizontalPodAutoscaler to achieve this. Our manifest files look like:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
  namespace: some-namespace
  labels:
    componentName: our-component
    app: our-component
    version: some-version
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "9252"
spec:
  serviceName: our-component
  replicas: 2
  selector:
    matchLabels:
      componentName: our-component
      app: our-app
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        componentName: our-component
        app: our-app
    spec:
      containers:
        - name: our-component-container
          image: image-path
          imagePullPolicy: Always
          resources:
            requests:
              cpu: .1
              memory: 500Mi
            limits:
              cpu: 1
              memory: 1Gi
          command:
            - "/microservice/bin/our-component"
          ports:
            - name: remoting
              containerPort: 8080
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /ready
              port: 9085
            initialDelaySeconds: 40
            periodSeconds: 30
            failureThreshold: 3
            timeoutSeconds: 30
          livenessProbe:
            httpGet:
              path: /alive
              port: 9085
            initialDelaySeconds: 130
            periodSeconds: 30
            failureThreshold: 3
            timeoutSeconds: 5
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: some-namespace
  labels:
    componentName: our-component
    app: our-app
spec:
  minReplicas: 2
  maxReplicas: 8
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: our-component
  targetCPUUtilizationPercentage: 75
The issue we face is that as soon as we deploy the application, it scales up to the maxReplicas
defined, even if there is no load on the application. Also, the application never seems to scale down.
Can someone who has faced a similar issue share why this happens and whether they were able to resolve it?
Upvotes: 0
Views: 160
Reputation: 4800
I suspect this is just because you have such a low CPU request value. The HPA measures utilization relative to the request, so with a 0.1 CPU request and a 75% target, anything above roughly 75m of CPU usage per pod will trigger a scale-up. Startup activity can easily use more than that, so even the startup activity is enough to force the HPA to spawn more pods. Those new pods then get added to the Akka cluster, which creates consensus/gossip activity that can, again, push the average usage of all the pods above the target.
It's a little surprising to me, because you'd think an idle application would stabilize below 0.1 vCPU, but 0.1 vCPU is very tiny.
I'd test raising the CPU request and seeing whether the scaling behavior stabilizes.
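A minimal sketch of the kind of change I'd try; the 500m value is purely illustrative, not a recommendation:

resources:
  requests:
    cpu: 500m        # illustrative bump from .1; the 75% target now corresponds to ~375m per pod
    memory: 500Mi
  limits:
    cpu: 1
    memory: 1Gi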
Upvotes: 0
Reputation: 2909
The issue is with the resource requests and limits. You have set the requested CPU to 1 and the limit to 0.1. Therefore, as soon as your pod runs, the limit is naturally exceeded, the autoscaler kicks in, and it keeps scaling to the maximum number of replicas.
You need to swap the values so that your request becomes 0.1 and your limit becomes 1.0. That way your pod starts with 0.1 units of CPU shares, and once average usage grows beyond 75% of the request across all pods you get more replicas; if usage drops, you scale back down, just as expected.
As a general rule of thumb, the request should be less than the limit, or at most equal to it. It cannot be higher, because you end up with infrastructure that scales indefinitely.
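A minimal sketch of the corrected block, assuming the values from the question, with the resulting scale-up threshold noted:

resources:
  requests:
    cpu: 0.1         # 100m; with a 75% target the HPA scales up above ~75m average per pod
    memory: 500Mi
  limits:
    cpu: 1           # limit stays above the request
    memory: 1Gi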
Upvotes: 1