Reputation: 2233
I have two consumers with different client IDs and group IDs. Aside from retention hours and max partitions, my Kafka installation uses the default configuration. I've looked around to see whether anyone else has had the same issue but can't find any results.
So the scenario goes like this:
Consumer A: Connects to Kafka, consumes the roughly 3 million messages waiting on the topic, and then sits idle waiting for more messages.
Consumer B: Has a different client / group ID and connects to the same Kafka topic; this causes consumer A to receive the same 3 million messages again while consumer B consumes them as well.
The two consumers are completely different Java applications with different client and group IDs, running on the same computer. The Kafka server is on another computer.
Is this normal behavior in Kafka? I am at a complete loss.
Here is my consumer config:
bootstrap.servers=192.168.110.109:9092
acks=all
max.block.ms=2000
retries=0
batch.size=16384
auto.commit.interval.ms=1000
linger.ms=0
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
block.on.buffer.full=true
enable.auto.commit=false
auto.offset.reset=none
session.timeout.ms=30000
zookeeper.session.timeout=100000
rebalance.backoff.ms=8000
group.id=consumerGroupA
zookeeper.connect=192.168.110.109:2181
poll.interval=100
The only difference in consumer B's config is group.id=consumerGroupB.
Upvotes: 0
Views: 692
Reputation: 625
This is correct behavior: based on your config, your consumers never commit the offsets of the records they have read.
When a consumer reads a record, it must commit the offset so Kafka knows where it left off. You can either have offsets committed automatically by setting enable.auto.commit=true, or commit them manually after processing. In your case, auto commit should be fine.
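For illustration, a minimal sketch of a consumer with auto commit enabled (this assumes a recent kafka-clients version; the topic name "myTopic" is a placeholder, and the other settings mirror the relevant parts of your config):

// Minimal sketch: consumer with automatic offset commits.
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AutoCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.110.109:9092");
        props.put("group.id", "consumerGroupA");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "true");       // commit read offsets automatically
        props.put("auto.commit.interval.ms", "1000");  // how often offsets are committed

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("myTopic")); // placeholder topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                // Alternative: keep enable.auto.commit=false and commit manually
                // after each batch has been processed:
                // consumer.commitSync();
            }
        }
    }
}

With auto commit, the consumer's position is stored in Kafka under its group ID, so a restart (or another group joining) no longer causes it to re-read messages it has already processed.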
Upvotes: 2