Reputation: 447
I am performing a performance test on a microservice application built in .NET 8.0, and we have recently encountered an issue. When I set a CPU limit on our pods, they begin to restart as soon as the application reaches 20 transactions per second (TPS).
I have monitored the situation using Dynatrace and various kubectl commands to check CPU and memory utilization, and I confirmed that resource usage never exceeds the configured 60% threshold; it stays below 40% right up until the pods restart.
Despite my investigation, I have not been able to find a solution. Any insights or guidance on how to resolve this would be greatly appreciated!
Please note that when I remove the CPU limit from the deployment file, the pods scale correctly and there are no restarts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app-container
          image: my-app-image:latest
          resources:
            requests:
              memory: "1512Mi"
              cpu: "2"    # Request for CPU
            limits:
              memory: "2Gi"
              cpu: "4"    # Limit for CPU
          ports:
            - containerPort: 80
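For completeness, this is a minimal sketch of the kubectl checks I can run against one of the restarting pods to see why the container was killed; the pod name below is just a placeholder:

# Why was the previous container instance terminated (e.g. OOMKilled, Error)?
kubectl get pod my-app-7d4b9c6f5-abcde -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

# The Events section lists probe failures, OOM kills, evictions, etc.
kubectl describe pod my-app-7d4b9c6f5-abcde

# Restart counts across the deployment
kubectl get pods -l app=my-app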
Upvotes: 0
Views: 182
Reputation: 5692
You can run out of more resources than just CPU. Pick a node, any node that is exhibiting the behavior. Profile for the request with the highest cost in terms of total response time, and look at the variance (standard deviation) of the requests. High cost combined with high variance is a prime indicator of being bound on some member of a finite resource pool (thread pools, connection pools, handles, and so on).
Once you have identified the highest-cost item, the one most likely holding resources that other requests cannot access (or have to wait for), bring in the deep diagnostic superhero tool (Dynatrace) and drill into that request, profiling all of its calls to find the call with the highest cost and variance. That is likely your root problem. Optimize that!
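As a rough sketch of that first pass, and assuming you can export the per-request response times for a single endpoint from your load-test tool into a file (one value per line, in milliseconds; the file name here is only an example), the cost and variance can be computed with something like:

# response_times.txt is assumed to hold one response time (ms) per line for one endpoint
awk '{ sum += $1; sumsq += $1 * $1; n++ }
     END {
       mean = sum / n
       printf "n=%d  mean=%.2f ms  stddev=%.2f ms\n", n, mean, sqrt(sumsq / n - mean * mean)
     }' response_times.txt

Rank the endpoints by mean plus standard deviation and start the Dynatrace drill-down with the worst offender.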
Upvotes: -1