Reputation: 34979
I'm using uWSGI in a container hosted in a Kubernetes cluster. uWSGI supports a traditional master/slave (master/worker) architecture to provide better availability for the application, but my question is: should I even use this feature?
In other words, when I need more processes to handle and compute requests, should I increase the number of pods in the cluster, or should I still use the master/slave mode of uWSGI to serve the requests?
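For context, the kind of setup I mean is roughly the following (the module name and worker counts are just placeholders, not my real config):

```ini
; placeholder uwsgi.ini - module name and numbers are illustrative only
[uwsgi]
module = myapp.wsgi:application
master = true          ; the master process supervises the workers
processes = 4          ; worker processes handling requests
threads = 2            ; threads per worker
http = :8000
die-on-term = true     ; exit cleanly on SIGTERM so Kubernetes can stop the pod
```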
Upvotes: 13
Views: 3692
Reputation: 22208
I think there are two different discussions here: one about master/slave (leader-based) availability patterns, and one about how to scale the actual work.
Both patterns are closely related (a master-slave architecture can also be called a leader-based replication architecture). While this is a common pattern in the database world, in K8S it is more common for workloads to meet availability/fault-tolerance requirements by using K8S's native support for leader election.
In K8S it is common to see vendors providing controllers in a leader-election topology: if the leader fails, one of the replicas immediately replaces it (after all replicas reach consensus on the new leader), which gives the system high availability.
In some cases all replicas do work but only the leader is the decision maker, while in other cases only the leader actually works and the others are on standby.
The second discussion, scaling, is a bit trickier, and it mainly depends on your use case.
Before K8S it was trivial to use threads if you wanted to improve performance.
K8S has multiple scaling mechanisms of its own (the Horizontal Pod Autoscaler, the Vertical Pod Autoscaler and the Cluster Autoscaler); the relevant one here is the HPA, which changes the number of pods. Because of that, you should make sure that the logic that creates threads and manages the work for each thread is not something that should instead be handled by the HPA.
For example, let's say you want to consume messages from an SQS queue and process each message. In this case you can let the HPA handle the SQS side of the work and try to use threads(*)(**) for the internal processing logic of each message.
(*) Only if it makes sense - for example, if you have a lot of I/O operations.
(**) And only if it does not lead to issues like race conditions.
But something you need to be careful about is NOT mixing the "external" logic that K8S manages (the HPA) with the "internal" logic (threads) that the application code is responsible for.
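A rough sketch of that separation (assuming SQS via boto3; the queue URL, worker count and process_message are hypothetical, not part of your setup): the HPA decides how many pods run this loop, while threads only parallelise the work inside one pod.

```python
# Hypothetical sketch: the HPA scales the number of pods running this consumer;
# threads only parallelise the processing of messages inside a single pod.
from concurrent.futures import ThreadPoolExecutor
import boto3

QUEUE_URL = "https://sqs.<region>.amazonaws.com/<account>/<queue>"  # placeholder
sqs = boto3.client("sqs")

def process_message(msg):
    # I/O-bound per-message work (see footnote * above); message deletion omitted
    ...

def main():
    with ThreadPoolExecutor(max_workers=4) as pool:
        while True:
            resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
            for msg in resp.get("Messages", []):
                pool.submit(process_message, msg)

if __name__ == "__main__":
    main()
```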
Upvotes: 0
Reputation: 2692
We use a deployment model in which a Django-based app is served by gunicorn with a couple of worker processes. We have also scaled this pod to 2-3 replicas and have seen performance improvements.
It's totally up to what works for your app.
The advantage of scaling pods is that you can configure it dynamically, so you don't waste resources.
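For example (the deployment name and thresholds are placeholders), a basic CPU-based autoscaler can be attached to the deployment like this:

```shell
# hypothetical deployment name and thresholds
kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=80
```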
Upvotes: 2
Reputation: 1266
Be conscious of having enough threads/processes/pods to maintain availability if your application blocks while serving each HTTP request (e.g. Django). There is some pod startup time if you're using a horizontal pod autoscaler, and with a high-traffic application I had much better availability running uwsgi and the application together in each pod (same container), with a separate nginx pod doing reverse proxying and request pooling when all the uwsgi workers were busy.
YMMV, but at the end of the day availability is more important than sticking to the single-process-per-pod rule of thumb. Just understand the downsides, such as less isolation between the processes within the same container. Logs are available on a per-container basis, so with the built-in kubectl logs functionality there won't be any isolation between the processes sharing a container.
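As a minimal sketch of the nginx side of that setup, assuming the uwsgi pods are reachable through a Service called app-uwsgi on port 8000 (both names are placeholders):

```nginx
# placeholder names: "app-uwsgi" Service, port 8000
upstream app {
    server app-uwsgi:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;   # forward requests to the uwsgi-backed pods
    }
}
```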
Upvotes: 10
Reputation: 19123
The recommended way to manage this in Kubernetes is to increase the number of Pods based on the workload requirements.
Upvotes: 5