Reputation: 711
What happens if a consumer starts reading from a topic, then the internet connection goes down before the consumer finishes reading? Does the message on the topic still remain? How does Kafka handle this kind of scenario?
Upvotes: 0
Views: 303
Reputation: 27346
Typically, queue consumers track explicit acknowledgements. That is, a consumer says "Thanks, I've processed that" and the server says "You're welcome".
Kafka handles this by storing an offset. The offset is the consumer's position in the stream. For example, let's say I've got a stream with four elements in it.
A, B, C, D
A is the first element, so a consumer with an offset of 0 will pull A. Once they have processed A, they will update their offset to 1. It is common practice to store this on the broker side in the __consumer_offsets topic.
When their offset becomes 1, they get the next element, which is B. They process it and update their offset in the __consumer_offsets topic to 2. So on and so forth.
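To make the offset bookkeeping concrete, here is a minimal in-memory sketch. It is a toy model, not Kafka client code: the log list, the committed dict standing in for the __consumer_offsets topic, the group name, and the poll/commit/consume_all helpers are all hypothetical names chosen for illustration.

```python
# Toy model of offset tracking; this is NOT the real Kafka client API.
log = ["A", "B", "C", "D"]          # the stream from the example above
committed = {"my-group": 0}         # stand-in for the __consumer_offsets topic


def poll(group):
    """Return the record at the group's committed offset, or None at the end."""
    offset = committed[group]
    return log[offset] if offset < len(log) else None


def commit(group):
    """Advance the group's committed offset by one."""
    committed[group] += 1


def consume_all(group):
    """Read, process, then commit each record in order."""
    processed = []
    while (record := poll(group)) is not None:
        processed.append(record)    # "processing" is just collecting here
        commit(group)               # only now does the offset move forward
    return processed
```

Because the offset only moves when commit is called, whatever was read but not yet committed is simply read again on the next poll.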
So what happens during an outage mid read?
There is a timeline of events to consider during this outage:
__consumer_offsets topic.
Any error that happens up to and including step 4 will result in a simple re-request and reprocess. This means you'll need to handle an item being half processed if your consumer is stateful.
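The re-request behaviour can be sketched with a small simulation. Everything here is hypothetical (the flaky_process handler, the seen set used for deduplication, the simulated ConnectionError); it only illustrates that an uncommitted offset causes the same record to be handed out again, and that a stateful consumer can guard against applying a half-processed record twice.

```python
# Toy model: a failure before the offset commit leads to a re-read.
log = ["A", "B", "C", "D"]
committed_offset = 0        # stand-in for the value stored in __consumer_offsets
seen = set()                # hypothetical dedup state kept by the consumer
results = []                # side effects of processing
attempts = []               # every delivery, including retries
fail_once = {"B"}           # simulate one outage while handling B


def flaky_process(record):
    """Apply the record's effect at most once, then maybe fail before commit."""
    attempts.append(record)
    if record not in seen:  # idempotence guard for redelivered records
        seen.add(record)
        results.append(record)
    if record in fail_once:
        fail_once.discard(record)
        raise ConnectionError("simulated outage mid-read")


while committed_offset < len(log):
    record = log[committed_offset]
    try:
        flaky_process(record)
    except ConnectionError:
        continue            # offset was not committed, so B is fetched again
    committed_offset += 1   # commit only after processing succeeded
```

In this run B is delivered twice, but its effect is recorded only once because of the seen check. Without that guard, results would contain B twice.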
An error that happens after step 4 has completed but before step 5 has completed will NOT result in a reprocess. Instead, the consumer will re-establish the connection, update the offset and process the next item.
Upvotes: 4