Reputation: 34979
I'm using uWSGI in a container hosted in a Kubernetes cluster. uWSGI supports a traditional master/slave (master/worker) architecture to provide better availability for the application, but my question is: should I even use this feature?
In other words, when I need more processes to handle and compute requests, should I increase the number of pods in the cluster, or should I still use the master/slave mode of uWSGI to serve the requests?
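For context, the kind of setup I mean is roughly the following (the module name and worker counts are just placeholders, not my real config):

```ini
; placeholder uwsgi.ini - module name and numbers are illustrative only
[uwsgi]
module = myapp.wsgi:application
master = true          ; the master process supervises the workers
processes = 4          ; worker processes handling requests
threads = 2            ; threads per worker
http = :8000
die-on-term = true     ; exit cleanly on SIGTERM so Kubernetes can stop the pod
```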
Upvotes: 13
Views: 3692
Reputation: 22208
I think there are two different discussions here: one about master/slave (leader-based) availability patterns, and one about how to scale the actual work.
Both patterns are closely related (a master-slave architecture can also be called a leader-based replication architecture). While this is a common pattern in the database world, in K8S it is more common for workloads to meet availability/fault-tolerance requirements by using K8S's native support for leader election.
In K8S it is common to see vendors providing controllers in a leader-election topology: if the leader fails, one of the replicas immediately replaces it (after all replicas reach consensus on the new leader), which gives the system high availability.
In some cases all replicas do work but only the leader is the decision maker, while in other cases only the leader actually works and the others are on standby.
The second discussion, scaling, is a bit trickier, and it mainly depends on your use case.
Before K8S it was trivial to use threads if you wanted to improve performance.
K8S has multiple scaling mechanisms of its own (the Horizontal Pod Autoscaler, the Vertical Pod Autoscaler and the Cluster Autoscaler); the relevant one here is the HPA, which changes the number of pods. Because of that, you should make sure that the logic that creates threads and manages the work for each thread is not something that should instead be handled by the HPA.
For example, let's say you want to consume messages from an SQS queue and process each message. In this case you can let the HPA handle the SQS side of the work and try to use threads(*)(**) for the internal processing logic of each message.
(*) Only if it makes sense - for example, if you have a lot of I/O operations.
(**) And only if it does not lead to issues like race conditions.
But something you need to be careful about is NOT mixing the "external" logic that K8S manages (the HPA) with the "internal" logic (threads) that the application code is responsible for.
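A rough sketch of that separation (assuming SQS via boto3; the queue URL, worker count and process_message are hypothetical, not part of your setup): the HPA decides how many pods run this loop, while threads only parallelise the work inside one pod.

```python
# Hypothetical sketch: the HPA scales the number of pods running this consumer;
# threads only parallelise the processing of messages inside a single pod.
from concurrent.futures import ThreadPoolExecutor
import boto3

QUEUE_URL = "https://sqs.<region>.amazonaws.com/<account>/<queue>"  # placeholder
sqs = boto3.client("sqs")

def process_message(msg):
    # I/O-bound per-message work (see footnote * above); message deletion omitted
    ...

def main():
    with ThreadPoolExecutor(max_workers=4) as pool:
        while True:
            resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
            for msg in resp.get("Messages", []):
                pool.submit(process_message, msg)

if __name__ == "__main__":
    main()
```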
Upvotes: 0
Reputation: 2692
We use a deployment model in which a Django-based app is served by gunicorn with a couple of worker processes. We have also scaled this pod to 2-3 replicas and have seen performance improvements.
It's totally up to what works for your app.
The advantage of scaling pods is that you can configure it dynamically, so you don't waste resources.
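For example (the deployment name and thresholds are placeholders), a basic CPU-based autoscaler can be attached to the deployment like this:

```shell
# hypothetical deployment name and thresholds
kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=80
```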
Upvotes: 2
Reputation: 1266
Be conscious of having enough threads/processes/pods to maintain availability if your application blocks while serving each HTTP request (e.g. Django). There is some pod startup time if you're using a horizontal pod autoscaler, and with a high-traffic application I had much better availability running uwsgi and the application together in each pod (same container), with a separate nginx pod doing reverse proxying and request pooling when all the uwsgi workers were busy.
YMMV, but at the end of the day availability is more important than sticking to the single-process-per-pod rule of thumb. Just understand the downsides, such as less isolation between the processes within the same container. Logs are available on a per-container basis, so with the built-in kubectl logs functionality there won't be any isolation between the processes sharing a container.
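As a minimal sketch of the nginx side of that setup, assuming the uwsgi pods are reachable through a Service called app-uwsgi on port 8000 (both names are placeholders):

```nginx
# placeholder names: "app-uwsgi" Service, port 8000
upstream app {
    server app-uwsgi:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;   # forward requests to the uwsgi-backed pods
    }
}
```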
Upvotes: 10
Reputation: 19123
The recommended way to manage this in Kubernetes is to increase the number of Pods based on the workload requirements.
Upvotes: 5