Hector

Reputation: 5418

How to Improve Performance of Kafka Producer when used in Synchronous Mode

I have developed an application on Kafka 0.9.0.1 that cannot afford to lose any messages.

I have a constraint that the messages must be consumed in the correct sequence.

To ensure I do not lose any messages, I have implemented retries within my application code and configured my Producer with acks=all.

To enforce exception handling and fail fast, I immediately call get() on the Future returned by Producer.send(), e.g.

final Future<RecordMetadata> futureRecordMetadata = KAFKA_PRODUCER.send(producerRecord);
futureRecordMetadata.get();

This approach works fine for guaranteeing the delivery of all messages, however the performance is completely unacceptable.

For example, it takes 34 minutes to send 152,125 messages with acks=all.

When I comment out the futureRecordMetadata.get(), I can send 1,089,125 messages in 7 minutes.

When I change acks=all to acks=1, I can send 815,038 messages in 30 minutes. Why is there such a big difference between acks=all and acks=1?

However, by not blocking on get() I have no way of knowing whether the message arrived safely.

I know I can pass a Callback to send() and have Kafka retry for me; however, this approach has the drawback that messages may be consumed out of sequence.
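For context, whether producer-side retries can reorder messages depends on a few settings of the Java producer. The following is a minimal sketch of such a configuration, not a verified fix: the class name OrderedProducerConfig, the broker address, the String serializers and the retry count are assumptions for illustration. The keys themselves (acks, retries, max.in.flight.requests.per.connection) are the Java producer's configuration names; limiting in-flight requests to 1 is what the Kafka documentation describes as avoiding reordering on retry, at some cost in throughput.

```java
import java.util.Properties;

public class OrderedProducerConfig {

    // Builds settings intended for org.apache.kafka.clients.producer.KafkaProducer.
    // Only java.util is used here, so the sketch compiles without the Kafka jar.
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Assumption: keys and values are Strings.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas to acknowledge each record.
        props.setProperty("acks", "all");
        // Let the producer retry failed sends itself (the count is an arbitrary example).
        props.setProperty("retries", "5");
        // With more than one in-flight request, a retried batch can land after a
        // later batch; a value of 1 avoids that reordering.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        build("localhost:9092").list(System.out);
    }
}
```

The resulting Properties object would be passed to the KafkaProducer constructor; a Callback given to send() can then log or escalate any record that still fails after the configured retries.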

I thought the request.required.acks config could save the day for me; however, when I set any value for it I receive this warning:

130 [NamedConnector-Monitor] WARN org.apache.kafka.clients.producer.ProducerConfig - The configuration request.required.acks = -1 was supplied but isn't a known config.

Is it possible to asynchronously send Kafka messages, with a guarantee they will ALWAYS arrive safely and in the correct sequence?

UPDATE 001

Is there any way I can consume messages in Kafka message key order directly from the topic?

Or would I have to consume messages in offset order and then sort them programmatically into message key order?

Upvotes: 4

Views: 3815

Answers (1)

Shawn Guo

Reputation: 3228

If you need a total order across all messages, send performance will be poor (in practice, a total-order requirement is very rare).
If per-partition order is acceptable, you can use a multi-threaded producer: one producer/thread for each partition.
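
As a rough illustration of the one-thread-per-partition idea, here is a sketch using only the JDK. Everything named here is hypothetical: PartitionedSender, the sendBlocking callback and the key hashing are stand-ins, and Kafka's own default partitioner uses a different hash. In a real application sendBlocking would presumably create a ProducerRecord with an explicit partition and block on send(...).get(); that part is deliberately left out so the block runs without a broker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

public class PartitionedSender {
    // One single-threaded executor per partition: tasks for the same partition
    // run strictly in submission order, while partitions proceed in parallel.
    private final List<ExecutorService> workers = new ArrayList<>();
    // Hypothetical hook: performs the blocking send of one value to one partition.
    private final BiConsumer<Integer, String> sendBlocking;

    public PartitionedSender(int partitions, BiConsumer<Integer, String> sendBlocking) {
        for (int i = 0; i < partitions; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
        this.sendBlocking = sendBlocking;
    }

    // Stand-in key-to-partition mapping (not Kafka's default partitioner).
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), workers.size());
    }

    // Queues the message on the worker that owns the key's partition.
    public void submit(String key, String value) {
        int partition = partitionFor(key);
        workers.get(partition).execute(() -> sendBlocking.accept(partition, value));
    }

    // Stops accepting work and waits for queued sends; returns false on timeout or interrupt.
    public boolean shutdownAndWait(long timeoutSeconds) {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        boolean allDone = true;
        try {
            for (ExecutorService w : workers) {
                allDone &= w.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return allDone;
    }

    public static void main(String[] args) {
        // The lambda only prints; a real one would block on the producer's Future.
        PartitionedSender sender = new PartitionedSender(3,
                (partition, value) -> System.out.println("partition " + partition + " <- " + value));
        for (int i = 0; i < 9; i++) {
            sender.submit("key-" + (i % 3), "message-" + i);
        }
        sender.shutdownAndWait(10);
    }
}
```

Each worker still blocks on its own sends, so ordering within a partition is kept, but the blocking cost is spread across partitions instead of serialising the whole stream.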

Upvotes: 4
