Reputation: 431
I was reading the example at kubernetes hpa example. In this example they run:
kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
So the pod asks for 200m of CPU (0.2 of a core). After that they create an HPA with a target CPU of 50%:
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
This means the desired per-pod usage is 200m * 0.5 = 100m. They run a load test that pushes utilization up to 305%, which means the autoscaler scales up to ceil((3.05 * 200m) / 100m) = 7 pods, according to the hpa scaling algorithm.
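As a quick sanity check, the same arithmetic can be written out as a small sketch (the function and variable names below are mine for illustration, not from the HPA controller; it just mirrors desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)):

import math

def desired_replicas(current_replicas, current_utilization_pct, target_utilization_pct):
    # HPA scaling rule: round up the ratio of observed to target utilization
    return math.ceil(current_replicas * current_utilization_pct / target_utilization_pct)

# The example above: 1 replica at 305% utilization with a 50% target
print(desired_replicas(1, 305, 50))  # -> 7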
This is all good, but we are experimenting with different values and I wonder if it's a good approach.
We opted for a target CPU of 500% (the second option). To me, a target CPU >= 100% is a weird concept (maybe I'm misunderstanding it, please correct me, as I'm not that familiar with the whole concept), but it slows down scaling compared to the inverse (the first option), as shown in the sketch below.
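To make the "slows down scaling" part concrete, here is a rough comparison under the same assumptions as the example (200m requests, one replica under 305% load; the helper is the same illustrative sketch as above):

import math

def desired_replicas(current_replicas, current_utilization_pct, target_utilization_pct):
    return math.ceil(current_replicas * current_utilization_pct / target_utilization_pct)

# With requests=200m, a 50% target aims at ~100m per pod,
# while a 500% target tolerates up to ~1000m per pod before reacting.
print(desired_replicas(1, 305, 50))   # -> 7 replicas with a 50% target
print(desired_replicas(1, 305, 500))  # -> 1 replica with a 500% target (no scale-up)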
Upvotes: 2
Views: 2654
Reputation: 11098
The first approach is correct.
The second one is not good for a few reasons:
Another very important thing: when planning your Horizontal Pod Autoscaler, consider the total amount of resources available in your cluster, so you don't find yourself in a situation where you run out of resources.
Example: you have a system with 2-core processors, which equals 2000 millicores available from the perspective of your cluster. Let's say you decided to create the following deployment:
kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=500m --expose --port=80
and then a Horizontal Pod Autoscaler:
kubectl autoscale deployment php-apache --cpu-percent=100 --min=1 --max=5
This means you allow more resources to be requested than you actually have available in your cluster: 5 replicas * 500m = 2500m, which exceeds the 2000m you have, so in such a situation the 5th replica will never be created.
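A back-of-the-envelope check of the requests against the node capacity (illustrative numbers taken from the example above, ignoring any system overhead on the node):

node_capacity_m = 2000      # 2-core node = 2000 millicores
request_per_pod_m = 500     # --requests=cpu=500m
max_replicas = 5            # --max=5

for replicas in range(1, max_replicas + 1):
    requested = replicas * request_per_pod_m
    status = "fits" if requested <= node_capacity_m else "does not fit"
    print(f"{replicas} replicas request {requested}m -> {status}")
# The 5th replica would bring the total to 2500m, which exceeds the 2000m
# available, so it can never be scheduled.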
Upvotes: 2