Reputation: 18867
I'm getting started with Kubernetes, so I have the following question:
Say a microservice has the following C# code snippet:
var tasks = _componentBuilders.Select(b =>
{
    return Task.Factory.StartNew(() => b.SetReference(context, typedModel));
});
Task.WaitAll(tasks.ToArray());
On my box, I understand that each thread will be executed on a vCPU. So if I have 4 cores with hyperthreading enabled, I will be able to execute 8 tasks concurrently. Therefore, if I have about 50,000 tasks, it will take roughly
(50,000/8) * approximate time per task
to complete this work. This ignores context switching, etc.
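For concreteness, a quick back-of-envelope sketch of that estimate (the 10 ms per task is a made-up number):

// Rough estimate only: 8 hardware threads, a hypothetical 10 ms per task;
// ignores context switching and scheduler overhead.
const int taskCount = 50_000;
const int hardwareThreads = 8;
const double secondsPerTask = 0.010;
double totalSeconds = taskCount / (double)hardwareThreads * secondsPerTask;
Console.WriteLine($"Estimated wall time: ~{totalSeconds:F1} s"); // ~62.5 s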
Now, moving to the cloud: assume this code runs in a Docker container managed by a Kubernetes Deployment, with a single Docker container per VM to keep things simple. How does the above code scale horizontally across the VMs in the deployment? I can't find very clear guidance on this, so if anyone has any reference material, that would be helpful.
Upvotes: 3
Views: 3526
Reputation: 44657
In Kubernetes, when we deploy containers as Pods, we can include the resources.requests.cpu and resources.limits.cpu fields for each container in the Pod's manifest:
resources:
  requests:
    cpu: "1000m"
  limits:
    cpu: "2000m"
In the example above we have a request for 1 CPU and a limit of a maximum of 2 CPUs. This means the Pod will be scheduled onto a worker node that can satisfy these resource requirements.
One cpu, in Kubernetes, is equivalent to 1 vCPU/core on cloud providers and 1 hyperthread on bare-metal Intel processors.
We can scale vertically by increasing or decreasing the values of the requests and limits fields, or scale horizontally by increasing or decreasing the number of replicas of the Pod.
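For example, a minimal Deployment manifest combining both knobs might look like this (the name and image are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4                 # horizontal scaling: more/fewer identical Pods
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.example.com/my-app:latest   # placeholder image
        resources:
          requests:
            cpu: "1000m"      # vertical scaling: raise/lower these values
          limits:
            cpu: "2000m"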
For more details about resource units in Kubernetes, see the official documentation on managing compute resources for containers.
Upvotes: 4
Reputation: 159771
You'll typically use a Kubernetes Deployment object to deploy application code. That has a replicas: setting, which launches some number of identical, disposable Pods. Each Pod has a container, and each Pod will independently run the code block you quoted above.
The challenge here is distributing work across the Pods. If each Pod generates its own 50,000 work items, they'll all do the same work and things won't happen any faster. Just running your application in Kubernetes doesn't give you any prebuilt way to share thread pools or task queues between Pods.
A typical approach here is to use a job queue system; RabbitMQ is a popular open-source option. One part of the system generates the tasks and writes them into RabbitMQ; one or more workers read jobs from the queue and run them. You can set this up and demonstrate it to yourself without any container technology, then repackage it in Docker or Kubernetes, changing only the RabbitMQ broker address at deploy time.
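As a rough sketch, a worker built on the RabbitMQ .NET client (6.x) might look like the following; the queue name and environment variable are placeholders, not anything from the question:

using System;
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

class Worker
{
    static void Main()
    {
        // Read the broker address from the environment so the same image
        // runs locally and in Kubernetes (the variable name is an assumption).
        var factory = new ConnectionFactory
        {
            HostName = Environment.GetEnvironmentVariable("RABBITMQ_HOST") ?? "localhost"
        };
        using var connection = factory.CreateConnection();
        using var channel = connection.CreateModel();

        // "tasks" is a placeholder queue name.
        channel.QueueDeclare(queue: "tasks", durable: true,
                             exclusive: false, autoDelete: false);

        // Hand each worker one unacknowledged job at a time.
        channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false);

        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (sender, ea) =>
        {
            var job = Encoding.UTF8.GetString(ea.Body.ToArray());
            // ... run the actual work for this job here, serially ...
            channel.BasicAck(deliveryTag: ea.DeliveryTag, multiple: false);
        };
        channel.BasicConsume(queue: "tasks", autoAck: false, consumer: consumer);

        Console.WriteLine("Waiting for jobs; press Enter to exit.");
        Console.ReadLine();
    }
}

Because each job is acknowledged only after the work completes, a job whose worker dies mid-run is redelivered to another worker.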
In this setup I'd probably have each worker run jobs serially, one at a time, with no threading; that will simplify the implementation of the worker. If you want to run more jobs in parallel, run more workers; in Kubernetes, increase the Deployment's replicas: count.
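For example, assuming the Deployment is named worker, you can scale it from the command line:

kubectl scale deployment/worker --replicas=10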
Upvotes: 5