Aladin

Reputation: 512

Apache Kafka KRaft mode topology best practices

My question concerns the recommended topology of Kafka brokers and controllers in KRaft mode.

Now, according to best practices with ZooKeeper, we are supposed to create:

  1. {3,5,7} ZooKeeper nodes
  2. {3,5,7} Kafka broker nodes

This is a well-known structure that is recommended in every book and online course. But one of the drawbacks of this model is that we need at least six machines / nodes, which is a lot.

Now, I'm afraid that in KRaft mode things might be different. The alternatives I see are the following:

  1. Three nodes where each node consists of a controller and a broker. I'm not sure it's a good one for production, because once a single node is down (controller + broker), our system becomes fragile and we cannot afford losing another node. Plus, I think it can introduce complications if we want to update one node in production and another one crashes.
  2. Six nodes: three separate controllers and three separate brokers - This is a good solution that better handles some of the issues mentioned in (1), but I think we can find something better.
  3. Five nodes where each node is both a controller and a broker - I know that five nodes are usually reserved for heavy-load systems, but I think this is much better than model (2). Why should we use six machines when we can use five and have a much more reliable and available system? In other words, we get a better and cheaper solution.
  4. Hybrid - some standalone controllers and brokers, and some mixed controllers and brokers - I'm not sure whether this model has any benefits.
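
To make the comparison concrete, here is a rough sketch of how many controller failures each option can survive. It assumes only that the KRaft controller quorum needs a majority of its voters to be alive; the function name and the option labels are made up for illustration, and broker-side availability (replication factor, min.insync.replicas) is not modeled.

```python
def controller_quorum_tolerance(voters: int) -> int:
    """Controller failures a majority-based quorum of `voters` can survive."""
    majority = voters // 2 + 1   # smallest group that is more than half
    return voters - majority     # voters that can be lost while keeping a majority


# (label, total machines, controller voters) for options 1-3 above
options = [
    ("1: three combined nodes", 3, 3),
    ("2: three controllers + three brokers", 6, 3),
    ("3: five combined nodes", 5, 5),
]

for label, machines, voters in options:
    tolerated = controller_quorum_tolerance(voters)
    print(f"{label}: {machines} machines, quorum survives {tolerated} controller failure(s)")
```

Note that in the combined options every lost machine also removes a broker, which the sketch does not account for.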

The only thing that worries me about model (3) is that I've not seen it anywhere else, so I'm not completely sure about it. Looking for your opinion and advice.

Upvotes: 2

Views: 1981

Answers (2)

musmil

Reputation: 53

Please make sure that topic settings such as min.insync.replicas can still be satisfied during a rolling update or restart. For example, if a broker and a controller run on the same machine, restarting that machine for a controller update also takes down the replicas it hosts as a broker, which can leave a topic offline.

When the controller quorum runs on machines separate from the brokers, restarting a controller never takes replicas down, so this problem is avoided completely.
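
As a rough illustration of the check described above, the sketch below tests whether a partition would still accept acks=all writes while one node is restarted. It is a simplification under stated assumptions: the function and parameter names are hypothetical, each node holds at most one replica of the partition, and leader election is not modeled.

```python
def writable_during_restart(in_sync_replicas: int,
                            min_insync_replicas: int,
                            restarted_node_hosts_replica: bool) -> bool:
    """True if acks=all writes can still succeed while one node restarts.

    Assumes the restarted node holds at most one replica of the partition.
    """
    remaining = in_sync_replicas - (1 if restarted_node_hosts_replica else 0)
    return remaining >= min_insync_replicas


# Combined node (broker + controller) that hosts a replica, all replicas healthy.
print(writable_during_restart(3, 2, restarted_node_hosts_replica=True))

# Same restart while another replica is already out of sync.
print(writable_during_restart(2, 2, restarted_node_hosts_replica=True))

# Dedicated controller node: restarting it removes no replica.
print(writable_during_restart(2, 2, restarted_node_hosts_replica=False))
```

With a replication factor of 3 and min.insync.replicas of 2, the risky case is the second one: restarting a combined node while another replica is already unavailable.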

Upvotes: 0

visi0nary

Reputation: 53

I might be a little late here, but your proposed third option is not really cheaper than the second if all five machines run as brokers. Brokers have way higher hardware requirements than controllers, so I would still go for the conservative second option. Controller nodes will need as little as maybe 8 GB RAM and some storage/CPU, so you only need three machines (the brokers) that do the heavy lifting with lots of RAM and storage (and some CPU).

Upvotes: 1
