Reputation: 7251
Using HPA (Horizontal Pod Autoscaler) and Cluster Autoscaler on GKE, pods and nodes are scaled up as expected. However, when demand decreases, pods seem to be deleted from random nodes, which leaves some nodes under-utilized. It is not cost effective...
EDIT: The HPA is based on a single metric, targetCPUUtilizationPercentage. VPA is not used.
This is the redacted YAML for the Deployment and HPA:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      containers:
      - name: c1
        resources:
          requests:
            cpu: 200m
            memory: 1.2G
      - name: c2
        resources:
          requests:
            cpu: 10m
        volumeMounts:
        - name: log-share
          mountPath: /mnt/log-share
      - name: c3
        resources:
          requests:
            cpu: 10m
          limits:
            cpu: 100m
        volumeMounts:
        - name: log-share
          mountPath: /mnt/log-share
      volumes:
      - name: log-share
        emptyDir: {}
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: foo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 1
  maxReplicas: 60
  targetCPUUtilizationPercentage: 80
...
EDIT2: Added an emptyDir volume to make the example valid.
How do I improve this situation?
There are some ideas, but none of them solve the issue completely...
Upvotes: 1
Views: 529
Reputation: 7251
Sorry, I failed to mention the use of emptyDir (I have edited the YAML in the question).
As I noted in a comment on the question, I found "What types of pods can prevent CA from removing a node?" in the Cluster Autoscaler FAQ:
Pods with local storage. *
An emptyDir volume counts as local storage, so I needed to add the following annotation to the pod template of the Deployment to mark the pods as safe to evict from under-utilized nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      ...
After specifying the annotation, the size of the GCE instance group backing the GKE node pool became smaller than before. I think it worked!
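For reference, this is roughly how I would check the result (a sketch, not from the original post; CLUSTER_NAME and ZONE are placeholders for your own cluster):

# List nodes and see how many remain after scale-down
kubectl get nodes
# Check per-node requested CPU/memory to spot under-utilized nodes
kubectl describe nodes | grep -A 8 "Allocated resources"
# Check the current node count of the GKE cluster
gcloud container clusters describe CLUSTER_NAME --zone ZONE --format='value(currentNodeCount)'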
Thank you to everyone who commented on the question!
Upvotes: 3