Reputation: 99
I created a Kafka topic with 10 partitions and am trying to consume messages through a single Kafka consumer. However, the consumer is not reading messages from all partitions; it consumes from only 5 specific partitions. For example, the consumer reads messages from partitions [0,1,2,3,4] only. If, after a restart, it starts consuming from [5,6,7,8,9], then it will keep consuming from only those partitions. Here is the output of the kafka-consumer-offset-checker.sh command:
Group | Topic  | Pid | Offset | logSize | Lag | Owner
GRP1  | topic1 | 0   | 128    | 175     | 47  | none
GRP1  | topic1 | 1   | 117    | 146     | 29  | none
GRP1  | topic1 | 2   | 62     | 87      | 25  | none
GRP1  | topic1 | 3   | 101    | 143     | 42  | none
GRP1  | topic1 | 4   | 104    | 145     | 41  | none
GRP1  | topic1 | 5   | 118    | 118     | 0   | none
GRP1  | topic1 | 6   | 111    | 111     | 0   | none
GRP1  | topic1 | 7   | 161    | 161     | 0   | none
GRP1  | topic1 | 8   | 144    | 144     | 0   | none
GRP1  | topic1 | 9   | 171    | 171     | 0   | none
Does anyone know why this is happening?
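For context, the consumer is created roughly like the following sketch (the actual consumer code is not shown above, so this is only an assumed setup using the 0.10.x new-consumer API; the broker address is an assumption, while group GRP1 and topic1 are taken from the offset-checker output):

    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import scala.collection.JavaConverters._

    object SingleConsumer extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092") // assumption: broker address
      props.put("group.id", "GRP1")
      props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

      // a single consumer subscribed to the 10-partition topic; it should be assigned all partitions
      val consumer = new KafkaConsumer[String, String](props)
      consumer.subscribe(Collections.singletonList("topic1"))

      while (true) {
        val records = consumer.poll(1000) // poll(long) is the 0.10.x signature
        for (record <- records.asScala)
          println(s"partition=${record.partition()} offset=${record.offset()} value=${record.value()}")
      }
    }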
Upvotes: 10
Views: 12644
Reputation: 655
Recommended Kafka consumer configuration
To set up a single partition per consumer, the Kafka configuration needs to be designed accordingly. I would advise having exactly as many partitions as single-threaded consumers for each topic.
This means that if you want 5 consumers, each exclusively consuming a single partition, you have to create the topic with 5 partitions. In your case you might have to reduce the number of partitions using the
./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic testKafka --partitions 5
command (note that --zookeeper takes the ZooKeeper address, typically port 2181, not the broker port).
As for your question: since the topic is only partially consumed, the consumers have probably been configured into a consumer group. Another consumer in the same group might be consuming the remaining partitions, and because there are fewer consumers than partitions, some partitions appear inactive from the point of view of any single consumer.
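One way to check this (a sketch, assuming the 0.10.x new-consumer API; the broker address is an assumption) is to print the partitions actually assigned to the consumer after it has joined the group. If fewer than 10 partitions show up, another member of the same group owns the rest:

    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import scala.collection.JavaConverters._

    object AssignmentCheck extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092") // assumption: broker address
      props.put("group.id", "GRP1")                    // same group id as in the question
      props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

      val consumer = new KafkaConsumer[String, String](props)
      consumer.subscribe(Collections.singletonList("topic1"))

      // assignment() is only populated after the first poll() has joined the group
      consumer.poll(1000)
      println("partitions assigned to this consumer: " + consumer.assignment().asScala.mkString(", "))
      consumer.close()
    }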
Upvotes: 1
Reputation: 344
I was having a similar issue this week while using Spark Streaming to read from a Kafka topic with 32 partitions. Specifically, we were using the Spark Kafka streaming classes provided by Apache, org.apache.spark.streaming.kafka010.*.
We were only able to consume from a single partition. The issue was that we were including Kafka version 0.10.1.0 with our jar. Reverting to 0.10.0.1 fixed it, even though our cluster is on 0.10.1.0.
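A sketch of how that downgrade could be expressed, assuming an sbt build (the Spark version and artifact names here are assumptions, adjust them to your project):

    // build.sbt (sketch, assuming sbt 1.x): keep the Spark Kafka 0.10 integration,
    // but pin the Kafka client jar back to 0.10.0.1
    libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.1.0"
    dependencyOverrides += "org.apache.kafka" % "kafka-clients" % "0.10.0.1"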
Upvotes: 0