Performance issue: low microservice CPU utilization in K8s (also impacts development and DevOps)
When I designed a microservice and deployed it to K8s, I found that I could not get higher utilization from my microservices (max. utilization was only 0.1-0.3 CPU). Do you have best practices for how we can increase microservice CPU utilization?
Let me describe the LAB environment:
- K8s with 5 nodes
- each node with 14 CPUs and 128 GB RAM (nodes are built on VMware virtual machines)
- K8s with nginx, configured with full logging, etc.
- Microservice
- Written in Python (the GIL limits processing in one process, i.e. max. 1 CPU utilization per process; see the Gunicorn sketch after this list)
- I used three pods
- REST request/response interface (no additional I/O operations)
- Processing time per call is ~100 ms
We ran performance tests, with these results:
- Microservice utilization was max. 0.1-0.3 CPU in each pod
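Because of the GIL, one Python process can use at most about one CPU, so getting per-pod utilization above that requires several worker processes inside the pod. A minimal sketch, assuming the service is a WSGI app served by Gunicorn (Gunicorn and these settings are assumptions for illustration, not our actual setup):

```python
# gunicorn.conf.py -- illustrative config, assuming a WSGI app (e.g. Flask).
# One worker process per CPU core sidesteps the GIL, because each worker is
# a separate Python process with its own interpreter.
import multiprocessing

bind = "0.0.0.0:8080"
workers = multiprocessing.cpu_count()  # in practice, match the pod's CPU limit
worker_class = "sync"                  # CPU-bound handlers gain little from async workers
timeout = 30
```

With something like this, the pod's CPU request/limit can be set equal to the worker count, and per-pod utilization can scale with the number of workers instead of being capped at ~1 CPU.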
I suspect the issue is that K8s management overhead (routing, logging, …) consumes resources and cannot deliver enough throughput to keep our microservices busy. I think the best practices for higher microservice utilization could be:
1] Increase the number of pods
- Pros: we will get higher total microservice utilization, although the number of pods per K8s node is limited
- Cons: the utilization per pod will still be the same
2] Use micro-batch processing (see the sketch after this list)
- Pros: we can bundle calls (over e.g. one or two seconds), so the processing time per invocation on the microservice side will be higher
- Cons: bundling increases latency (not ideal for real-time processing)
3] Change the K8s log level
- Pros: we can lower the log level in nginx, etc. to error
- Cons: possible problems with detailed issue tracking
4] Use K8s nodes on physical HW (not VMware)
- Pros: better performance
- Cons: this change can generate additional costs (new HW) and maintenance
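For option 2, a minimal sketch of what micro-batching could look like on the microservice side, assuming an in-process queue; the names and process_batch() are placeholders, not our real code:

```python
# Micro-batching sketch: collect calls for up to BATCH_WINDOW seconds,
# then process them as one bundle.
import queue
import threading
import time

BATCH_WINDOW = 1.0   # seconds to wait while bundling calls
MAX_BATCH = 50       # upper bound on bundle size

requests_q: "queue.Queue[dict]" = queue.Queue()

def process_batch(batch):
    # stand-in for the real ~100 ms-per-call processing
    print(f"processing bundle of {len(batch)} calls")

def batch_worker():
    while True:
        batch = [requests_q.get()]            # block until the first call arrives
        deadline = time.monotonic() + BATCH_WINDOW
        while len(batch) < MAX_BATCH:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(requests_q.get(timeout=remaining))
            except queue.Empty:
                break
        process_batch(batch)

if __name__ == "__main__":
    threading.Thread(target=batch_worker, daemon=True).start()
    for i in range(10):
        requests_q.put({"call": i})
    time.sleep(2 * BATCH_WINDOW)  # give the worker time to flush the bundle
```

As the cons above note, this trades latency (up to BATCH_WINDOW extra) for better per-call efficiency.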
Do you have other best practices or ideas for high microservice utilization in K8s (my aim is to get 0.8-1 CPU per pod for this Python code)?
Answers (1)
Performance testing is a very complex topic. It requires a lot of precision when building the test setup and solid knowledge of all the moving parts, since it's very easy to mess things up (I have done that many times).
A couple of ideas from my side:
- If you run a single-threaded app on a pod with more than 1 CPU configured, then you'll never see high CPU usage on a pod level.
- Even if you run a multi-threaded app with a heavily I/O-bound workload (lots of external HTTP calls, for example), you still won't see high CPU usage, since the threads will spend most of their time in a non-runnable state.
- Kubernetes management workflows do have some overhead that can be observed in cluster-level (or even node-level) metrics, but pod-level metrics are fully attributable to your application (especially CPU usage).
So to see high CPU usage at the pod level, you can do two things:
- Run a single-threaded app (that does CPU-heavy tasks) in a pod configured with 1 CPU
- If you have a multi-threaded app, the pod's CPU count should match the number of threads in your app (and the workload should be CPU-bound, of course) to get maximum CPU usage.
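A minimal sketch of both points, using a CPU-bound dummy workload; for Python specifically, processes are used instead of threads because of the GIL mentioned in the question:

```python
# A CPU-bound task keeps at most one core busy per Python process. With one
# process you cannot exceed ~1 CPU of usage; with one process per available
# CPU you can approach the pod's full CPU limit. busy_loop() is a made-up
# workload for demonstration only.
import multiprocessing
import time

def busy_loop(seconds: float) -> None:
    end = time.monotonic() + seconds
    x = 0
    while time.monotonic() < end:
        x += 1  # pure CPU work, no I/O, so the process stays runnable

if __name__ == "__main__":
    # Single process: expect ~1 CPU used, no matter how many CPUs the pod has.
    busy_loop(5)

    # One process per available CPU: expect usage close to the pod's CPU limit.
    n = multiprocessing.cpu_count()
    with multiprocessing.Pool(processes=n) as pool:
        pool.map(busy_loop, [5.0] * n)
```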