ravthiru
ravthiru

Reputation: 9623

Kafka message ordering during consumer rebalance

How message ordering is ensured during consumer rebalance. Suppose initially we have four partitions : p1 , p2 , p3, p4 and two consumers c1 and c2 (in same Group). So each consumer gets two partitions for example c1 : p1, p2 and c2 : p3, p4.

Now new consumers are added say c3 and c4, rebalancing happens so that each consumer gets one Partition like c1 : p1, c2:p2, c3:p3, c4:p4.

During this time there are chances that consumer c1 might be processing the message from partition p2 (before rebalancing)

and consumer c2 also starts processing p2 messages (after rebalancing)

Even though this is corner case, is this expected behaviour of message ordering ?

Upvotes: 1

Views: 1859

Answers (2)

Matthias J. Sax
Matthias J. Sax

Reputation: 62285

During this time there are chances that consumer c1 might be processing the message from partition p2 (before rebalancing)

and consumer c2 also starts processing p2 messages (after rebalancing)

Yes. But how does this relate to message ordering? As long as there is no error, c1 should finish processing the current record (let's say with offset X) and after rebalance c2 will continue to process record with offset X+1.

And even if an error occurs and c1 fails to commit offset X -- c2 will reprocess some already processed messages but the order will still be preserved for partition p2.

A partitioned would only not be processed in-order, if a record with offset X1 would be processed before a record with offset X2 < X1. But that is never the case (you need to exclude reprocessing on caise of failure of course).

Long story short: yes, this is behavoir by design

If you build a stateless application and each record is processed independently this work very smooth. If you want state, you would need to make sure that the state of partition p2 it transferred from consumer c1 to c2 after rebalance (before c2 start to process data). Moving the state is actually a tough problem, and you should consider using Kafka Streams -- Kafka's stream processing library that can handling this for you automatically: http://docs.confluent.io/current/streams/index.html

Upvotes: 3

amethystic
amethystic

Reputation: 7079

There is actually no message ordering across partitions, so this is an expected behavior where C1 consumes P1 before C2 takes over it and starts to read after a rebalance.

Upvotes: 0

Related Questions