user432024

Reputation: 4665

How to guarantee order in Kafka partition

OK, so I understand that you only get an ordering guarantee per partition.

Just a random thought/question.

Assume the partitioning strategy is correct and messages are routed to the proper partition (or even that we are using a single partition).

I suppose the producing application must send messages to Kafka one at a time and make sure each message has been acked before sending the next one, right?

Upvotes: 6

Views: 5192

Answers (4)

Smallriver

Reputation: 51

There are two strategies for sending messages in Kafka: synchronous and asynchronous.

With synchronous sends, the producer sends messages one by one to the target partition, so message order is guaranteed.

With asynchronous sends, messages are batched. That is, if M1 is sent before M2, M1 is accumulated in memory first, followed by M2. When the producer then sends a batch of messages in a single request, the order of the messages within that batch is preserved.

Upvotes: 0

Binita Bharati

Reputation: 5888

Yes, the producer should be single-threaded. If multiple producer threads produce to the same partition, the ordering guarantee on the consumer side is lost. So an ordering guarantee on a partition implicitly also means a single producer thread.

Upvotes: 0

pherris

Reputation: 17693

Yes, you are correct that the order in which the producing application sends messages dictates the order in which they are stored in the partition.

Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log. http://kafka.apache.org/documentation.html#intro_guarantees

However, if you have multiple messages in flight simultaneously, I am not sure how order is determined.
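To make the in-flight question concrete, here is a toy simulation, not the Kafka client itself. It assumes a simplified model in which a batch that fails on its first attempt is retried only after the other batches in the same in-flight window have been appended. The class name `InFlightReorderSketch`, the `deliver` method, and the `failIndex` parameter are all hypothetical names made up for this illustration. It mirrors the reordering risk the Kafka producer documentation describes for retries with more than one in-flight request when idempotence is not enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class InFlightReorderSketch {

    /**
     * Toy model of appends to one partition log (an assumption, not broker code).
     * Batches are sent in order, up to maxInFlight at a time. The batch at
     * failIndex fails on its first attempt and is retried after the rest of
     * its window has already been appended.
     */
    public static List<String> deliver(List<String> batches, int maxInFlight, int failIndex) {
        List<String> log = new ArrayList<>();
        int i = 0;
        while (i < batches.size()) {
            int end = Math.min(i + maxInFlight, batches.size());
            List<String> retries = new ArrayList<>();
            for (int j = i; j < end; j++) {
                if (j == failIndex) {
                    retries.add(batches.get(j)); // first attempt fails, retry later
                } else {
                    log.add(batches.get(j));     // appended on first attempt
                }
            }
            log.addAll(retries);                 // retried batches land last in the window
            i = end;
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> batches = List.of("M1", "M2", "M3");
        // One request in flight: the retry of M1 completes before M2 is sent.
        System.out.println(deliver(batches, 1, 0)); // [M1, M2, M3]
        // Two requests in flight: M2 is appended while M1 is being retried.
        System.out.println(deliver(batches, 2, 0)); // [M2, M1, M3]
    }
}
```

In this model, order only breaks when a send fails and is retried while a later request is already outstanding, which is why limiting in-flight requests to 1 avoids it.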

You might want to think about the acks config for your producer as well. There are failure conditions where a message may go missing, for example if the leader goes down after M1 is published and a new leader receives M2. In that case you won't have an out-of-order condition but a missing message. That is slightly orthogonal to your original question, but it is something to consider if delivery guarantees and order are critical to your application. http://kafka.apache.org/documentation.html#producerconfigs

Overall, designing a system where small differences in order are not that important can really simplify things.

Upvotes: 8

Shawn Guo

Reputation: 3218

Either send messages synchronously, one by one (definitely slow!),
or send them asynchronously in batches with max.in.flight.requests.per.connection = 1.
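As a rough sketch of the asynchronous option, the settings could be collected in a plain `java.util.Properties` object like this. The class name `OrderedProducerConfigSketch`, the helper `orderedProducerProps`, and the broker address `localhost:9092` are placeholders I made up for illustration. The config keys are standard producer settings, and `acks=all` is an extra choice for durability rather than something ordering strictly requires.

```java
import java.util.Properties;

public class OrderedProducerConfigSketch {

    /** Builds producer settings that favor per-partition ordering over throughput. */
    public static Properties orderedProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Only one unacknowledged request per broker connection, so a retried
        // batch cannot be overtaken by a later one.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        // Wait for all in-sync replicas to acknowledge (durability, assumption).
        props.setProperty("acks", "all");
        // String serializers are assumed here; use whatever matches your data.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = orderedProducerProps("localhost:9092");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

These properties would then be passed to the `KafkaProducer` constructor. Sends stay asynchronous and batched, but only one request is outstanding per connection at a time.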

Upvotes: 5
