Reputation: 21472
Our server runs on Kubernetes for auto-scaling, and we use New Relic for observability, but we are facing some issues:
1- We need to restart pods when memory usage reaches 1G. Right now a pod restarts automatically when it reaches 1.2G, but by then everything has become slow.
2- We need to terminate pods when there are no requests to the server.
My configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app: {{ .Release.Name }}
spec:
  revisionHistoryLimit: 2
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Release.Name }}
          image: "{{ .Values.imageRepository }}:{{ .Values.tag }}"
          env:
            {{- include "api.env" . | nindent 12 }}
          resources:
            limits:
              memory: {{ .Values.memoryLimit }}
              cpu: {{ .Values.cpuLimit }}
            requests:
              memory: {{ .Values.memoryRequest }}
              cpu: {{ .Values.cpuRequest }}
      imagePullSecrets:
        - name: {{ .Values.imagePullSecret }}
      {{- if .Values.tolerations }}
      tolerations:
{{ toYaml .Values.tolerations | indent 8 }}
      {{- end }}
      {{- if .Values.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
      {{- end }}
My values file:
memoryLimit: "2Gi"
cpuLimit: "1.0"
memoryRequest: "1.0Gi"
cpuRequest: "0.75"
That's what I am trying to achieve.
Upvotes: 2
Views: 5282
Reputation: 6851
If you want to be sure your pod/deployment won't consume more than 1.0Gi of memory, then setting the memory limit (memoryLimit in your values file) to that value will do the job just fine.
Once you set that limit and your container exceeds it, the container becomes a potential candidate for termination. If it continues to consume memory beyond its limit, the container is terminated. If a terminated container can be restarted, the kubelet restarts it, as with any other type of runtime container failure.
For more reading, please see the section "Exceed a Container's memory limit" in the Kubernetes documentation.
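As a minimal sketch of that change, assuming the same key names as the values file in the question, the limit could be lowered to 1Gi. The request is kept at or below the limit, since Kubernetes rejects a container whose memory request exceeds its memory limit:

```yaml
# values.yaml (sketch; key names taken from the question's values file)
memoryLimit: "1Gi"      # container is OOM-killed and restarted once it goes beyond this
cpuLimit: "1.0"
memoryRequest: "1Gi"    # must not exceed memoryLimit
cpuRequest: "0.75"
```

Note that this gives a hard kill at the limit rather than a graceful restart, so in-flight requests on that container are dropped when it happens.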
Moving on, if you wish to scale your deployment based on requests, you need custom metrics provided by an external adapter such as the Prometheus adapter. Natively, the Horizontal Pod Autoscaler only scales on CPU and memory (based on the metrics from the metrics server).
The adapter documentation provides a walkthrough of how to configure it with the Kubernetes API and the HPA. The list of other adapters can be found here.
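For comparison, a native memory-based autoscaler might look roughly like the sketch below. The `maxReplicas` and the 80% target are assumed values for illustration, and the object is named after the release only to match the Deployment in the question. Utilization is measured relative to the container's memory request, not its limit:

```yaml
# Sketch of an HPA using only the built-in resource metrics (requires metrics server)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Release.Name }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}   # the Deployment from the question
  minReplicas: 1
  maxReplicas: 5                # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # assumed: percent of the memory request, averaged over pods
```

This adds or removes replicas as average memory usage changes; it does not restart an individual pod at a given memory threshold.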
Then you can scale your deployment based on the http_requests metric, as shown here, or on a requests-per-second metric, as described here.
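A rough sketch of such an autoscaler is shown below. It assumes a metrics adapter is already installed and exposes a per-pod metric named `http_requests` through the custom metrics API; the `maxReplicas` and target value are placeholders to tune for your workload:

```yaml
# Sketch of an HPA driven by a custom per-pod metric (assumes an adapter serves http_requests)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Release.Name }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}   # the Deployment from the question
  minReplicas: 1                # by default the HPA keeps at least one replica
  maxReplicas: 5                # assumed upper bound
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests   # metric name as exposed by the adapter (assumption)
        target:
          type: AverageValue
          averageValue: "10"    # assumed: target average requests per pod
```

With `minReplicas: 1` this scales the deployment down to a single pod when traffic stops, but not to zero. As far as I know, scaling to zero with the HPA depends on a feature gate that is not enabled by default, so treat that part as something to verify for your cluster version.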
Upvotes: 2