Ignacio Reyes Jainaga

Reputation: 22

How can you configure Apache Kafka to achieve 500 Mbps throughput over a high-latency (200 ms) / high-bandwidth (10 Gbps) connection?

I'm looking for advice on how to tune Apache Kafka to improve its throughput. My network might not be the most typical use case: high bandwidth (~10 Gbps) and high latency (~200 ms) between the Kafka brokers and the consumer. I need to move data at >500 Mbps.

The current setup has a topic with 45 partitions and 6 Kafka brokers. Using a Python script (confluent_kafka) that calls .consume(num_messages=30, timeout=5) with a single consumer group, I was able to reach about 30 Mbps.
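
The consumer logic is roughly the following (simplified sketch; the broker address, group id, and topic name are placeholders, not the real ones):

```python
from confluent_kafka import Consumer

# Minimal consumer sketch (placeholder broker / group / topic names).
consumer = Consumer({
    'bootstrap.servers': 'broker1:9092',   # placeholder
    'group.id': 'throughput-test',         # placeholder
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['my-topic'])           # placeholder topic

try:
    while True:
        # Batch consume as described above: up to 30 messages or a 5 s timeout.
        msgs = consumer.consume(num_messages=30, timeout=5)
        for msg in msgs:
            if msg.error():
                continue
            _ = msg.value()  # payload handling goes here
finally:
    consumer.close()
```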

I also tried consuming from an AWS EC2 instance closer to the Kafka brokers (~4 ms latency) and the peak throughput was 898 Mbps, so latency is clearly an important factor.

Then I ran a third experiment, launching 15 instances of the same script simultaneously so that Kafka distributes the partitions among the processes (roughly as sketched below). With this setup, the peak throughput went from 30 Mbps to 290 Mbps.
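
In practice I just launched the script 15 times, but the idea is equivalent to spawning several identical consumers in the same group (again a sketch with placeholder names):

```python
import multiprocessing

from confluent_kafka import Consumer

def run_consumer() -> None:
    """One consumer process; all processes share the same group.id,
    so Kafka splits the 45 partitions among them."""
    consumer = Consumer({
        'bootstrap.servers': 'broker1:9092',  # placeholder
        'group.id': 'throughput-test',        # same group for every process
    })
    consumer.subscribe(['my-topic'])          # placeholder topic
    try:
        while True:
            for msg in consumer.consume(num_messages=30, timeout=5):
                if msg.error():
                    continue
                _ = msg.value()  # payload handling goes here
    finally:
        consumer.close()

if __name__ == '__main__':
    processes = [multiprocessing.Process(target=run_consumer) for _ in range(15)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```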

I'm aware that TCP buffer sizes matter on connections with a high bandwidth-delay product, but I'm not sure which Kafka settings are the relevant ones here. I was thinking about configs along the lines of the sketch below:
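
These keys and values are just my guesses (librdkafka/confluent_kafka on the client side, broker and OS knobs in the comments), sized very roughly for a 10 Gbps x 200 ms path, i.e. a bandwidth-delay product of about 250 MB; nothing here is tested:

```python
# Consumer-side keys I was thinking about (librdkafka / confluent_kafka).
# Values are rough guesses, not a verified configuration.
tuned_consumer_conf = {
    'bootstrap.servers': 'broker1:9092',            # placeholder
    'group.id': 'throughput-test',                  # placeholder
    # Bigger socket receive buffer for the large bandwidth-delay product
    # (presumably also needs net.core.rmem_max raised on the client OS).
    'socket.receive.buffer.bytes': 16 * 1024 * 1024,
    # Let each fetch request / partition carry more data per round trip.
    'fetch.max.bytes': 64 * 1024 * 1024,
    'max.partition.fetch.bytes': 8 * 1024 * 1024,
    # Allow the client to pre-fetch and queue more data locally.
    'queued.max.messages.kbytes': 512 * 1024,
}
# On the broker side (server.properties), I assume the counterparts would be
# socket.send.buffer.bytes / socket.receive.buffer.bytes (or -1 to use the OS
# defaults together with raised net.core.wmem_max / rmem_max sysctls).
```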

Does anyone have experience with Kafka in a setting like this? Thanks!

Upvotes: -2

Views: 18

Answers (0)
