mmkk18

Reputation: 93

Kafka Producer: Understanding `retries` config param

The Kafka documentation for the producer config parameter `retries` says that retries happen for potentially transient errors:

Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first.

  1. What are these potentially transient errors? What scenarios cause these 'transient errors'?
  2. In the source code, I see that there is a RetriableException class and that TimeoutException extends it. However, I have observed that a retry does not happen for every timeout exception thrown by the send(ProducerRecord<K,V> record) method of org.apache.kafka.clients.producer.KafkaProducer. So what am I missing here? Aren't all RetriableExceptions retried when the retries param is greater than 0?

Upvotes: 2

Views: 1416

Answers (1)

unconditional

Reputation: 7666

The distinction between "potentially transient" and "permanent" errors is quite simple.

Potentially transient ones are temporary errors that have a chance of resolving automatically within reasonable time: connection errors, temporary states (no leader for partition), etc.

Permanent errors, on the other hand, are the ones that cannot be resolved by retrying and most likely require manual intervention: e.g. if a topic does not exist and is not auto-created, some external entity (such as a human) has to create it.

You can see which errors are marked as retriable in the protocol error list, for example here: https://kafka.apache.org/25/generated/protocol_errors.html

Regarding the retries in the producer code, it looks like they are done on a different level. If you trace the usage of RETRIES_CONFIG, you'll see that org.apache.kafka.clients.producer.KafkaProducer passes it on when creating an org.apache.kafka.clients.producer.internals.Sender, which is actually "The background thread that handles the sending of produce requests to the Kafka cluster." It has the canRetry() method, where the decision is made whether to retry the send or not.
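To make that decision more concrete, here is a simplified, self-contained model of the kind of check canRetry() performs. It is not the Kafka source: RetryDecisionSketch and the exception classes in it are hypothetical stand-ins, and the three conditions (error type is retriable, attempts so far are below `retries`, delivery timeout not yet reached) are my reading of the Sender logic in the 2.x clients, so check them against the version you actually run.

```java
// Simplified model of the retry decision; NOT the actual Kafka source.
// All class names below are hypothetical stand-ins for illustration.
class RetryDecisionSketch {

    // Stand-in for org.apache.kafka.common.errors.RetriableException.
    static class RetriableError extends RuntimeException {
        RetriableError(String message) { super(message); }
    }

    // Stand-in for a transient error such as "no leader for partition".
    static class NoLeaderError extends RetriableError {
        NoLeaderError(String message) { super(message); }
    }

    // Stand-in for a permanent error that retrying cannot fix.
    static class PermanentError extends RuntimeException {
        PermanentError(String message) { super(message); }
    }

    // A batch is retried only if all three conditions hold:
    // the delivery timeout has not elapsed, the attempts so far are
    // below the configured retries, and the error type is retriable.
    static boolean canRetry(RuntimeException error, int attempts, int retries,
                            long elapsedMs, long deliveryTimeoutMs) {
        return elapsedMs < deliveryTimeoutMs
                && attempts < retries
                && error instanceof RetriableError;
    }

    public static void main(String[] args) {
        RuntimeException transientError = new NoLeaderError("no leader for partition");
        RuntimeException permanentError = new PermanentError("cannot be fixed by retrying");

        // Retriable error, attempts left, still within the timeout.
        System.out.println(canRetry(transientError, 0, 3, 100L, 120_000L));     // true
        // Retriable error, but retries are exhausted.
        System.out.println(canRetry(transientError, 3, 3, 100L, 120_000L));     // false
        // Retriable error, but the delivery timeout has already passed.
        System.out.println(canRetry(transientError, 0, 3, 130_000L, 120_000L)); // false
        // Permanent error is never retried, whatever the retries setting.
        System.out.println(canRetry(permanentError, 0, 3, 100L, 120_000L));     // false
    }
}
```

This would also explain part of what you observed: as far as I can tell, a TimeoutException that is thrown before the record ever reaches the Sender (for example while send() is still waiting for metadata) never goes through this check, so `retries` does not apply to it.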

Upvotes: 0
