Reputation: 893
The author of the article "Kafka in a Nutshell" (at https://sookocheff.com/post/kafka/kafka-in-a-nutshell/) states that:
Kafka makes the following guarantees about data consistency and availability: (1) Messages sent to a topic partition will be appended to the commit log in the order they are sent, (2) a single consumer instance will see messages in the order they appear in the log, (3) a message is ‘committed’ when all in sync replicas have applied it to their log, and (4) any committed message will not be lost, as long as at least one in sync replica is alive.
The first and second guarantee ensure that message ordering is preserved for each partition. Note that message ordering for the entire topic is not guaranteed. ...
I'm curious as to what the author meant when he said:
Note that message ordering for the entire topic is not guaranteed.
Upvotes: 1
Views: 1149
Reputation: 3852
A Kafka topic consists of multiple partitions. Each message is appended to one partition, chosen by hashing the message key or by a partitioner rule (random, round-robin, custom, etc.).
Partitioning a topic parallelizes processing by distributing messages across the partitions.
Hence Kafka guarantees order within each partition, but because messages are distributed across partitions, it cannot guarantee order globally, that is, across the whole topic.
As in the diagram above, the producer publishes messages to the topic, and each one is appended sequentially to one of the partitions.
E.g. assume there are two partitions, p1 and p2, and partition selection is round-robin:
message 1 -> published to p1 at position 1
message 2 -> published to p2 at position 1
message 3 -> published to p1 at position 2
message 4 -> published to p2 at position 2
message 5 -> published to p1 at position 3
And so on. Consumers can therefore read the messages in an order different from the one in which the producer sent them.
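To make this concrete, here is a toy simulation in plain Python. It is only a sketch of the idea, not the Kafka client: the function names `publish_round_robin` and `consume_partition_by_partition` are made up for illustration, partitions are modelled as lists (index 0 and 1 rather than p1 and p2), and the consumer is assumed to drain one partition before moving to the next, which is just one of many possible read orders.

```python
from itertools import cycle

def publish_round_robin(messages, num_partitions):
    """Append each message to the next partition in turn (toy model)."""
    partitions = [[] for _ in range(num_partitions)]
    targets = cycle(range(num_partitions))
    for msg in messages:
        partitions[next(targets)].append(msg)
    return partitions

def consume_partition_by_partition(partitions):
    """One possible read order: drain each partition log in turn."""
    return [msg for log in partitions for msg in log]

messages = [1, 2, 3, 4, 5]
partitions = publish_round_robin(messages, num_partitions=2)
consumed = consume_partition_by_partition(partitions)

print(partitions)  # [[1, 3, 5], [2, 4]]
print(consumed)    # [1, 3, 5, 2, 4]
```

Each partition log is still in send order (1, 3, 5 and 2, 4), but the combined stream is not.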
If you want to have a global ordering, you will need to have only 1 partition.
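If a single partition is too restrictive, the key-hashing rule mentioned above gives a middle ground: messages with the same key go to the same partition (as long as the partition count does not change), so their relative order is preserved. A rough sketch, with assumptions: `pick_partition` and `publish_by_key` are hypothetical helpers, and `zlib.crc32` is only a stand-in for the hash, since Kafka's default partitioner uses a different hash function.

```python
import zlib

def pick_partition(key, num_partitions):
    # Stand-in hash for illustration; not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def publish_by_key(records, num_partitions):
    """Append each (key, value) record to the partition chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[pick_partition(key, num_partitions)].append((key, value))
    return partitions

records = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
partitions = publish_by_key(records, num_partitions=3)

for index, log in enumerate(partitions):
    print(index, log)
```

Whichever partition a key lands in, all records for that key sit in that one log in the order they were sent.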
Upvotes: 2