Reputation: 121
While working with Spark Structured Streaming and Kinesis Data Streams, I have experienced imbalanced reads when reprocessing data that has accumulated in the stream (as opposed to reading from the latest position).
The following graph shows the difference in read speed across the Kinesis shards that make up the stream.
This causes the Spark jobs to drop a lot of events: records with very different event times get mixed together, and those considered too old are discarded.
Recently a team member suggested using Kafka instead. I was a bit skeptical that Apache Kafka would solve this issue because, AFAIK, the only way to fix the imbalanced reads described above is to introduce some kind of coordination at the consumer level. This is how the Kinesis connector for Apache Flink provides alignment while reprocessing Kinesis streams (event-time-alignment-for-shard-consumers).
I have been investigating the architecture and design of Apache Kafka in more depth, but I can't see anything that resembles a coordination mechanism for consumer groups.
Still, after some tests, reprocessing messages from a Kafka topic is much more consistent across partitions. It seems as if there were some coordination mechanism.
I'm aware that introducing coordination among nodes in a distributed system like Kafka would decrease throughput (which is the price you pay for shard alignment in the Flink connector for Kinesis). That makes me even more curious: how can this possibly happen? How can Apache Kafka achieve this without a coordination mechanism?
Upvotes: 1
Views: 427
Reputation: 1821
The Kafka partitioner decides how messages are distributed across the partitions of a Kafka topic.
By default, the Kafka producer applies the murmur2 hash algorithm to the message key to decide which partition each record goes to. Thanks to this hash, Kafka guarantees that records with the same key always land in the same partition.
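As a sketch, this is the keyed-partitioning logic in pure Python: a port of the 32-bit murmur2 function the Java client uses, plus a hypothetical `partition_for` helper that mirrors the default partitioner's `toPositive(murmur2(key)) % numPartitions` step.

```python
def murmur2(data: bytes) -> int:
    """32-bit murmur2 hash (same constants as the Kafka Java client)."""
    length = len(data)
    seed = 0x9747B28C
    m = 0x5BD1E995
    r = 24
    h = (seed ^ length) & 0xFFFFFFFF
    i = 0
    # Process the input four bytes at a time.
    while length - i >= 4:
        k = data[i] | (data[i + 1] << 8) | (data[i + 2] << 16) | (data[i + 3] << 24)
        k = (k * m) & 0xFFFFFFFF
        k ^= k >> r
        k = (k * m) & 0xFFFFFFFF
        h = (h * m) & 0xFFFFFFFF
        h ^= k
        i += 4
    # Handle the remaining 0-3 bytes (fall-through, as in the C/Java original).
    left = length - i
    if left >= 3:
        h ^= data[i + 2] << 16
    if left >= 2:
        h ^= data[i + 1] << 8
    if left >= 1:
        h ^= data[i]
        h = (h * m) & 0xFFFFFFFF
    # Final avalanche mixing.
    h ^= h >> 13
    h = (h * m) & 0xFFFFFFFF
    h ^= h >> 15
    return h

def partition_for(key: bytes, num_partitions: int) -> int:
    """Hypothetical helper: mask off the sign bit, then take the modulo,
    as Kafka's default partitioner does for keyed records."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions
```

Because the mapping is a pure function of the key bytes and the partition count, every producer picks the same partition for the same key with no coordination at all.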
If your use case does not require ordering between events, you may not need to send a key at all. When no key is sent, messages are distributed across the partitions in a round-robin fashion.
When a consumer joins a consumer group, it is assigned one or more partitions that it alone is responsible for processing. No other consumer from the same consumer group shares ownership of those partitions.
So, to your question: if your producer distributes messages evenly across the topic's partitions, and the partition count divides evenly among the consumer threads in the group, each consumer owns the same number of partitions and consumption will be roughly the same across consumers.
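The even split above can be sketched with a simplified, single-topic version of the logic in Kafka's RangeAssignor (`range_assign` is a hypothetical helper, not the actual API):

```python
def range_assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Simplified single-topic range assignment: sort the consumers, give each
    a contiguous block of partitions; if the division isn't exact, the first
    (num_partitions % num_consumers) consumers each get one extra partition."""
    consumers = sorted(consumers)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        n = per + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + n]
        start += n
    return assignment
```

Each partition ends up owned by exactly one group member, so the "coordination" you observed is just this one-time assignment done by the group coordinator at rebalance, not any per-record alignment between consumers.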
Upvotes: 1