조서환

Reputation: 71

Questions about Kafka's partitions and throughput

We are doing R&D on using Kafka at our company. While learning about Kafka, I came up with the following questions.

My understanding is that Kafka assigns one consumer to each partition, and that a consumer can only process one message at a time.

So, under the following assumptions:

When the processing target of our service is 200,000 TPS,

  1. Do we need to create 200,000 partitions to implement this?
     1-1. What problems occur when partitions are created in such large numbers?
  2. Is Kafka horizontally scalable? In other words, if we add more servers, will there be any problems when creating and managing more partitions?

I am asking in order to find out whether we can flexibly handle the desired throughput by adding servers when the target changes.

Upvotes: -1

Views: 27

Answers (1)

Rishabh Sharma

Reputation: 862

  1. Yes, that is the most straightforward way to go about it. Alternatively, you can configure your consumer to read n messages and process them in n threads. For example, your consumer could read 1000 messages and process them in 1000 threads. You would then need 200 such consumers to reach 200,000 TPS, and your partition count would also need to be 200. These numbers assume each message takes about one second to process; faster processing needs proportionally fewer consumers and partitions. The disadvantage of this strategy is error handling: offsets are committed for the whole batch of messages, so if even 1 message fails, you have to retry the whole batch. Depending on how often errors occur, you can tune the number of consumers and the batch size.
  2. Theoretically speaking, having 200K partitions on a Kafka topic should cause no issues (I have not seen this in practice, so please take it with a pinch of salt). In Kafka, topics are just logical groupings of partitions. Since Kafka can handle such a partition count across multiple topics, it should also be able to handle this count in a single topic.
  3. Adding a new Kafka server is an easy process. However, Kafka does not automatically redistribute existing partitions onto the new server. Let us say you are running 2 Kafka servers and your topic has 30 partitions distributed evenly across them (15 partitions on each server). If you add a 3rd Kafka server, the topic layout remains unchanged (15 partitions on each of the 2 original servers and 0 partitions on the new one). You can then choose either to reassign the partitions so that each server holds 10, or to add new partitions to the topic and have them created on the 3rd server only.
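
The arithmetic in point 1 can be sketched roughly as below. The function name `consumers_needed` and the per-message processing times are assumptions made for illustration; in particular, the 200,000 and 200 figures only come out if each message takes about one second to process, which the thread does not state.

```python
def consumers_needed(target_tps, threads_per_consumer, ms_per_message):
    """Consumers (and so partitions) needed to keep up with target_tps.

    Assumes every thread handles one message at a time and every
    message takes ms_per_message milliseconds (an illustrative figure).
    """
    work_ms = target_tps * ms_per_message        # ms of work arriving per second
    capacity_ms = threads_per_consumer * 1000    # ms of work one consumer does per second
    return -(-work_ms // capacity_ms)            # integer ceiling division


# One message at a time, 1 s each: one partition per unit of TPS.
print(consumers_needed(200_000, 1, 1000))     # 200000
# 1000 threads per consumer, 1 s each: the 200 consumers from point 1.
print(consumers_needed(200_000, 1000, 1000))  # 200
# Same threads, but 10 ms per message.
print(consumers_needed(200_000, 1000, 10))    # 2
```

The last line suggests why measuring the real per-message processing time matters before settling on a partition count.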
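
The error-handling cost mentioned in point 1 can be illustrated with a small simulation. This is only a sketch: a plain Python list stands in for a partition and an integer stands in for the committed offset, so no Kafka client is involved, and `process_batch`, `consume`, and `make_flaky_handler` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(messages, handler, max_workers=8):
    """Run handler on every message in parallel; True only if all succeed."""
    def safe(msg):
        try:
            handler(msg)
            return True
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe, messages))
    return all(results)


def consume(log, batch_size, handler, max_attempts=3):
    """Read log in batches, committing only after a whole batch succeeds."""
    committed = 0   # stand-in for the committed offset
    deliveries = 0  # how many times a message was handed to the handler
    while committed < len(log):
        batch = log[committed:committed + batch_size]
        for _ in range(max_attempts):
            deliveries += len(batch)
            if process_batch(batch, handler):
                break
        else:
            raise RuntimeError(f"batch at offset {committed} keeps failing")
        committed += len(batch)
    return committed, deliveries


def make_flaky_handler(bad_message):
    """Handler that fails the first time it sees bad_message, then succeeds."""
    seen = set()

    def handler(msg):
        if msg == bad_message and msg not in seen:
            seen.add(msg)
            raise ValueError(f"transient failure on {msg}")

    return handler


# 10 messages, batches of 5, one transient failure on message 7.
print(consume(list(range(10)), 5, make_flaky_handler(7)))  # (10, 15)
```

One failing message causes its whole batch of 5 to be delivered again, so 10 messages cost 15 deliveries. Handlers therefore need to tolerate seeing the same message more than once.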
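
The 15/15/0 to 10/10/10 move described in point 3 can be planned with a few lines of code. The sketch below assumes one replica per partition, broker ids 1 to 3, and a hypothetical topic name `orders`; none of these come from the thread. The JSON it prints follows the layout that Kafka's `kafka-reassign-partitions.sh` tool reads as a reassignment file, but check the documentation for your Kafka version before relying on it.

```python
import json
from collections import Counter
from typing import Dict, List


def rebalance(assignment: Dict[int, int], brokers: List[int]) -> Dict[int, int]:
    """Return partition -> broker so every broker holds an even share.

    Partitions stay where they are until their broker's quota is used up;
    only the overflow is moved.
    """
    base, extra = divmod(len(assignment), len(brokers))
    # Brokers earlier in the list absorb any remainder.
    quota = {b: base + (1 if i < extra else 0) for i, b in enumerate(brokers)}
    plan, overflow = {}, []
    for partition in sorted(assignment):
        broker = assignment[partition]
        if quota.get(broker, 0) > 0:
            plan[partition] = broker
            quota[broker] -= 1
        else:
            overflow.append(partition)
    for partition in overflow:
        broker = next(b for b in brokers if quota[b] > 0)
        plan[partition] = broker
        quota[broker] -= 1
    return plan


def reassignment_json(topic: str, plan: Dict[int, int]) -> str:
    """Render the plan as a reassignment file (single replica assumed)."""
    partitions = [
        {"topic": topic, "partition": p, "replicas": [b]}
        for p, b in sorted(plan.items())
    ]
    return json.dumps({"version": 1, "partitions": partitions}, indent=2)


# 30 partitions spread evenly over brokers 1 and 2, then broker 3 is added.
current = {p: 1 + p % 2 for p in range(30)}
plan = rebalance(current, [1, 2, 3])
print(sorted(Counter(plan.values()).items()))          # [(1, 10), (2, 10), (3, 10)]
print(sum(1 for p in plan if plan[p] != current[p]))   # 10
print(reassignment_json("orders", plan)[:60])
```

Only the 10 partitions that end up on the new broker are moved; the other 20 stay in place, which keeps the amount of data copied between servers down.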

Upvotes: 0
