kramsiv94

Reputation: 711

Kafka: What happens when the connection is interrupted before a consumer finishes reading from a topic?

What happens if a consumer starts reading from a topic, then the internet connection goes down before the consumer finishes reading? Does the message on the topic still remain? How does Kafka handle this kind of scenario?

Upvotes: 0

Views: 303

Answers (1)

christopher

Reputation: 27346

Typically, queue consumers track explicit acknowledgements. That is, a consumer says "Thanks, I've processed that" and the server says "You're welcome".

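Kafka handles this by storing an offset. The offset is the consumer's position in the stream. Reading a record does not remove it from the topic; records stay in the log until the topic's retention policy deletes them. For example, let's say I've got a stream with four elements in it.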

A, B, C, D

A is at offset 0, so a consumer with an offset of 0 will pull A. Once it has processed A, it updates its offset to 1. It is common practice to store this offset on the broker side, in the __consumer_offsets topic.

When its offset becomes 1, it gets the next item, which is B. It processes B and increments its offset in the __consumer_offsets topic to 2, and so on.
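To make the offset bookkeeping concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Kafka client API: `log`, `committed`, `poll` and `commit` are hypothetical names standing in for a partition, the __consumer_offsets topic, and the fetch and commit calls.

```python
# Toy in-memory model of one partition and one consumer group.
# This is NOT the Kafka client API; all names are illustrative.
log = ["A", "B", "C", "D"]       # records in the partition, offsets 0..3
committed = {"group-1": 0}       # stand-in for the __consumer_offsets topic


def poll(group):
    """Return the record at the group's committed offset, or None at the end."""
    offset = committed[group]
    return log[offset] if offset < len(log) else None


def commit(group):
    """Advance the group's committed offset by one."""
    committed[group] += 1


seen = []
while (record := poll("group-1")) is not None:
    seen.append(record)   # "process" the record
    commit("group-1")     # only then move past it

print(seen)                   # ['A', 'B', 'C', 'D']
print(committed["group-1"])   # 4
print(log)                    # ['A', 'B', 'C', 'D'] (reading removed nothing)
```

Note that the log is unchanged at the end: only the stored offset moved.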

So what happens during an outage mid-read?

There is a timeline of events to consider during this outage:

  1. Consumer requests the next item in the topic, based on its offset.
  2. Consumer begins reading the next item in the topic.
  3. Consumer finishes reading the item in the topic.
  4. Consumer processes the item in the topic.
  5. Consumer updates its offset in the __consumer_offsets topic.
  6. Go back to 1.

Any error that happens up to and including step 4 will result in a simple re-request and reprocess. This means you'll need to handle an item being half processed if your consumer is stateful.

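An error that happens after step 4 has completed but before step 5 has completed will NOT result in a reprocess, as long as the consumer process itself survives: it re-establishes the connection, updates the offset and processes the next item. If the consumer restarts, or its partition is reassigned to another consumer, before the offset is committed, reading resumes from the last committed offset and the item is processed a second time, so processing should be idempotent where that matters.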
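The timeline can be played through with a small simulation. This is a sketch in plain Python under simplifying assumptions, not real Kafka client code: `consume` and its parameters are hypothetical, each failure fires only once, and `restart=True` models the assumption that the consumer process is replaced (rather than merely reconnected), so it resumes from the last committed offset.

```python
LOG = ["A", "B", "C", "D"]  # records in one partition, offsets 0..3


def consume(fail_processing_at=None, fail_commit_at=None, restart=False):
    """Toy read/process/commit loop; returns processing events in order."""
    committed = 0   # durable offset (stand-in for __consumer_offsets)
    position = 0    # in-memory position of the running consumer
    events = []
    while position < len(LOG):
        offset, record = position, LOG[position]   # steps 1-3: fetch and read
        if offset == fail_processing_at:
            fail_processing_at = None               # fail only once
            events.append(f"{record} (partial)")    # step 4 interrupted
            position = committed                    # re-request from last commit
            continue
        events.append(record)                       # step 4: processing done
        position = offset + 1
        if offset == fail_commit_at:
            fail_commit_at = None                   # fail only once
            if restart:
                position = committed                # new process starts at last commit
            continue                                # step 5 skipped this round
        committed = position                        # step 5: commit the offset
    return events


# Outage while processing B: B is requested and processed again.
print(consume(fail_processing_at=1))
# ['A', 'B (partial)', 'B', 'C', 'D']

# Outage after processing B, before the commit; the consumer survives.
print(consume(fail_commit_at=1))
# ['A', 'B', 'C', 'D']

# Same outage, but the consumer restarts from the last committed offset.
print(consume(fail_commit_at=1, restart=True))
# ['A', 'B', 'B', 'C', 'D']
```

The first and third runs are why a stateful consumer needs to tolerate seeing the same item more than once.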

Upvotes: 4
