Reputation: 23
We have a Kafka deployment with around 40 topics; each topic has 40 partitions and a replication factor of 3. The cluster has 7 brokers and 40 consumers. All nodes (brokers and consumers) are reasonably sized, hosted on AWS, and we hardly see any load spikes on any of the machines. Yet consumer lag is very high, even after adding 40 consumers to handle reads for this setup. This is despite an ingestion rate of only around 215 messages per second (each message is around 2 KB) to these topics. We have tried everything we could think of, but have not been able to solve the lag issue.
We also see that the consumers sit idle most of the time and only consume messages once in a while. Are 40 consumers enough to handle this scenario (40 topics with 40 partitions each, and around 215 messages, roughly 430 KB, ingested per second per topic)? Please help.
Upvotes: 2
Views: 14146
Reputation: 191743
It's not clear what group ID you've specified or what topics you are assigning to which consumer.
Assuming all consumers are reading from all topics (you subscribed to a pattern .*), then you're missing out on 1560 partitions that could have dedicated consumer instances (40*40 total partitions in the cluster - 40 existing "active" consumer threads).
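To make that arithmetic concrete, here is a minimal sketch using the numbers from the question. It assumes every consumer is in the same group and subscribed to every topic, which the question does not confirm; the variable names are only for illustration.

```python
# Figures taken from the question. That all consumers share one group and
# subscribe to every topic is an assumption, not something the question states.
TOPICS = 40
PARTITIONS_PER_TOPIC = 40
CONSUMERS = 40

total_partitions = TOPICS * PARTITIONS_PER_TOPIC        # partitions in the cluster
unpaired_partitions = total_partitions - CONSUMERS      # partitions without a dedicated consumer
partitions_per_consumer = total_partitions / CONSUMERS  # average share per consumer

print(total_partitions, unpaired_partitions, partitions_per_consumer)
```

Under those assumptions, each consumer ends up responsible for about 40 partitions rather than one.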
Since a single consumer thread works through its assigned partitions in turn, rather than reading all partitions of a given topic in parallel, it sounds to me like you'll need to add more consumers, ideally spread over several application instances / machines.
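As a rough illustration of how adding consumers changes the per-consumer load, the sketch below deals partitions out in round-robin fashion. This is a simplified model, not Kafka's actual assignor code; the function names `assign_round_robin` and `max_partitions_per_consumer` are hypothetical, and the consumer counts 160 and 1600 are arbitrary examples.

```python
from collections import defaultdict


def assign_round_robin(num_topics, partitions_per_topic, num_consumers):
    """Deal every (topic, partition) pair out to consumers in turn.

    Simplified model of a single consumer group; not Kafka's real assignor.
    """
    assignment = defaultdict(list)
    pairs = [(t, p) for t in range(num_topics) for p in range(partitions_per_topic)]
    for i, pair in enumerate(pairs):
        assignment[i % num_consumers].append(pair)
    return assignment


def max_partitions_per_consumer(num_consumers, num_topics=40, partitions_per_topic=40):
    """Return the largest number of partitions any one consumer has to serve."""
    assignment = assign_round_robin(num_topics, partitions_per_topic, num_consumers)
    return max(len(parts) for parts in assignment.values())


# Compare the current 40 consumers with two hypothetical larger groups.
for n in (40, 160, 1600):
    print(n, "consumers ->", max_partitions_per_consumer(n), "partitions per consumer at most")
```

In this model a consumer only gets a single dedicated partition once the group has as many consumers as there are partitions; whether that many are actually needed depends on how fast each consumer processes messages.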
Upvotes: 2