Apache Kafka - Algorithm/Strategy used to pull messages from different partitions of the same topic by a Single Consumer

Question

I have been studying Apache Kafka for a while now.

Let's consider the following example.

Suppose I have a topic with 3 partitions, a single producer, and a single consumer. I am producing my messages without specifying the key attribute.

So I know that on the producer side, when I publish a message, the strategy Kafka uses to assign it to one of those partitions is round-robin.

Now, what I want to know is: when I start a single consumer, belonging to a certain consumer group and listening to that same topic, what strategy will it use to pull messages from the different partitions (there are 3)?

Would it follow a similar round-robin model, where it sends a fetch request to the leader of partition 1, waits for the response, and returns the records for processing, then sends a fetch request to the leader of partition 2, and so on?

If it follows some other strategy/algorithm, I would love to know what it is.

Thank you in advance.

dawsaw · Accepted Answer

There is no ordering guarantee outside of a partition, so in a way the algorithm used is moot to the end user and subject to change.

Today, nothing terribly complex happens in this case. The protocol shows that a fetch request names the partitions it wants, so data is fetched per partition. That means the order in which records from different partitions arrive depends on the consumer implementation. A partition won't be starved, because fetch requests are issued for all partitions assigned to the consumer.
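To make the "no starvation, order only within a partition" point concrete, here is a rough toy model in Python. It is not the Kafka client's actual code: `poll_all`, `max_per_partition`, and the sample offsets are all made up for illustration. It only assumes that each poll asks every assigned partition for records.

```python
from collections import deque


def poll_all(partitions, max_per_partition=2):
    """Toy 'poll': take up to max_per_partition records from EVERY assigned partition."""
    batch = []
    for pid, log in partitions.items():
        for _ in range(min(max_per_partition, len(log))):
            batch.append((pid, log.popleft()))
    return batch


# Three partitions of one topic, each an ordered log of offsets (hypothetical data).
partitions = {0: deque(range(5)), 1: deque(range(3)), 2: deque(range(4))}

consumed = []
while any(partitions.values()):  # keep polling until every log is drained
    consumed.extend(poll_all(partitions))

# Group the consumed offsets by partition, preserving consumption order.
per_partition = {}
for pid, offset in consumed:
    per_partition.setdefault(pid, []).append(offset)

print(consumed)
print(per_partition)
```

In this sketch, records from the three partitions come out interleaved, but within each partition the offsets are always ascending, and every partition shows up in every poll while it still has data. The exact interleaving is an artifact of the toy loop, which is the part you should not rely on in real Kafka either.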
