Andremoniy

Reputation: 34920

A Kafka record is acknowledged but no data is returned to the consumer

There is a Kafka (version 2.2.0) cluster of 3 nodes. One node becomes artificially unavailable (network disconnection). Then we have the following behaviour:

  1. We send a record through a producer to a specific topic-partition (say, partition #0).

  2. We receive record metadata from the producer, which confirms that the record has been acknowledged.

  3. Immediately afterwards we poll a consumer that is assigned to the same topic-partition and positioned at the offset taken from the record's metadata. The poll timeout is set to 30 seconds. No data is returned (the result is an empty set).

This happens intermittently, and only under the circumstances described (one Kafka node has failed).

Essentially my question is: should data be immediately available to consumers once it is acknowledged? If not, what is a reasonable timeout?

UPD: some configuration details:

Upvotes: 2

Views: 953

Answers (1)

Robert Bräutigam

Reputation: 7744

In this Kafka version, the default setting of acks on the producer is 1. This means the producer waits for an acknowledgement from the leader replica only. If the leader dies right after acknowledging, before the followers have replicated the record, the message is lost and won't be delivered.

Should data be immediately available for consumers? Yes, in general. With default settings and no load, the lag should be very small, effectively in the millisecond range.

If you want to make sure that a message can't be lost, you have to configure the producer with acks=all and also set min.insync.replicas=2 on the topic (or as a broker default). With acks=all, every in-sync replica must acknowledge the message, and min.insync.replicas=2 ensures that at least 2 replicas do. So you can still lose one node and be fine. Lose 2 nodes and you won't be able to send, but even then acknowledged messages won't be lost.
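
As a rough sketch of those two settings, here is how they might look as plain java.util.Properties. The class name DurableProducerConfig, the helper method names, the localhost:9092 address and the String serializers are illustrative assumptions, not taken from the question. The keys themselves (acks, min.insync.replicas, bootstrap.servers, key.serializer, value.serializer) are standard Kafka configuration names.

```java
import java.util.Properties;

public class DurableProducerConfig {

    // Producer-side settings. The bootstrap address is supplied by the caller;
    // the String serializers are an assumption made for this example.
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas, not only the leader, before a send
        // is considered acknowledged.
        props.setProperty("acks", "all");
        return props;
    }

    // Topic-level setting (it can also be configured as a broker default).
    // With acks=all, a write is rejected unless at least 2 replicas are in sync.
    static Properties topicProps() {
        Properties props = new Properties();
        props.setProperty("min.insync.replicas", "2");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical local address, for illustration only.
        System.out.println(producerProps("localhost:9092"));
        System.out.println(topicProps());
    }
}
```

The producer properties would typically be passed to the KafkaProducer constructor. The topic setting is not a producer option, so it has to be applied to the topic itself when it is created or altered. This sketch assumes the topic has a replication factor of 3, matching the 3-node cluster in the question.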

Upvotes: 1
