Reputation: 255
Say I have a Kafka topic with 10 partitions. When the data rate increases, I can add partitions to speed up my processing logic.
My doubt is: is increasing the partitions the right approach, or should I split the topic instead? (That is, based on my application logic, some data would go to topic 1 and some to topic 2, so the data rate would be split across two topics.)
Does choosing a new topic over increasing partitions (or increasing partitions over creating a new topic) have any performance impact on the Kafka cluster?
Which one is the best solution?
Upvotes: 2
Views: 2003
Reputation: 26865
It depends!
It is usually recommended to slightly over-partition topics that are likely to increase in throughput so you don't have to add partitions when this happens.
The main reason is that if you're using keyed messages, adding partitions changes the key-to-partition mapping. After adding partitions, messages with a given key may no longer go to the same partition as before. If you need ordering per key, this can be problematic.
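A minimal sketch of why the mapping changes: Kafka's default partitioner hashes the key and takes it modulo the partition count (Kafka itself uses murmur2; CRC32 is used here as a deterministic stand-in, which shows the same effect). When the partition count changes, the modulo result changes for many keys:

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for Kafka's default partitioner (which uses murmur2 on the
    # key bytes); any hash-mod scheme exhibits the same remapping effect.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"user-{i}" for i in range(20)]

before = {k: partition_for(k, 10) for k in keys}  # topic with 10 partitions
after = {k: partition_for(k, 12) for k in keys}   # after adding 2 partitions

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed partition")
```

Messages for a "moved" key land in a different partition than that key's earlier messages, so per-key ordering across the resize is lost.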
Adding partitions is usually easier, as consumers and producers won't need updates; you can simply add consumers to scale. You also keep all events together and only have to worry about a single topic. Depending on the size of your cluster, with only 10 partitions you probably still have a lot of leeway to add more. From Kafka's point of view, 10 partitions is pretty small and you can easily have 50 or even more.
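For reference, adding partitions is a single admin command (the broker address and topic name below are assumptions; note Kafka only ever lets you increase the count, never reduce it):

```shell
kafka-topics.sh --bootstrap-server localhost:9092 \
  --alter --topic events --partitions 20
```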
On the other hand, when creating new topics, clients will need to be updated to use them. Nevertheless, that could be a solution if over time you start receiving more types of events and want to reorganize them across several topics.
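If you do split by event type, it is the producer side (your application), not Kafka, that decides where each record goes. A hypothetical routing sketch (the topic names and the `"type"` field are illustrative assumptions, not anything Kafka prescribes):

```python
def route_topic(event: dict) -> str:
    # Application-level routing: pick the destination topic per event.
    # In a real producer you'd then call producer.send(route_topic(e), e).
    return "orders" if event.get("type") == "order" else "payments"

print(route_topic({"type": "order", "id": 1}))    # → orders
print(route_topic({"type": "payment", "id": 2}))  # → payments
```

This is the update the answer mentions: every producer (and each consumer that cares about a given event type) must learn the new topic layout.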
Upvotes: 8