alph486

Reputation: 1249

Kafka Producers/Consumers over WAN?

I have a Kafka cluster in a data center. A number of clients that may communicate across WANs (even the internet) will send and receive real-time messages to and from the cluster.

I read from Kafka's Documentation:

...It is possible to read from or write to a remote Kafka cluster over the WAN though TCP tuning will be necessary for high-latency links.

It is generally not advisable to run a single Kafka cluster that spans multiple datacenters as this will incur very high replication latency both for Kafka writes and Zookeeper writes and neither Kafka nor Zookeeper will remain available if the network partitions.

From what I understand here and here:

Aren't clients reading from or writing to Kafka over a WAN then subject to the same limitations highlighted above for clusters?

Upvotes: 4

Views: 5088

Answers (1)

ppearcy

Reputation: 2762

The statements you have highlighted are mostly aimed at the internal communication within the Kafka/ZooKeeper cluster, where evil things will happen during network partitions, which are much more common across a WAN.

Producers are isolated and, if there are network issues, should be able to buffer and retry based on your settings.
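As a rough illustration of what "your settings" might look like, here is a minimal sketch that builds a producer configuration for a high-latency link. The keys are the Java producer's configuration names (`retries`, `retry.backoff.ms`, `buffer.memory`, `send.buffer.bytes`, `receive.buffer.bytes`); other client libraries may name them differently. The helper functions, the host name, and every numeric value are assumptions chosen for illustration, not recommendations. The socket buffers are sized from the bandwidth-delay product, which is one common way to approach the TCP tuning the documentation mentions.

```python
def bandwidth_delay_product_bytes(bandwidth_mbit_per_s, rtt_ms):
    """Bytes that must be in flight to keep a link of this bandwidth and RTT full."""
    return int(bandwidth_mbit_per_s * 1_000_000 * rtt_ms / 1000 / 8)


def wan_producer_config(bootstrap_servers, bandwidth_mbit_per_s, rtt_ms):
    """Hypothetical helper: producer settings for a WAN link (illustrative values)."""
    bdp = bandwidth_delay_product_bytes(bandwidth_mbit_per_s, rtt_ms)
    return {
        "bootstrap.servers": bootstrap_servers,
        "acks": "all",                      # wait for full acknowledgement before treating a send as done
        "retries": 10,                      # retry transient failures (assumed value)
        "retry.backoff.ms": 500,            # pause between retries (assumed value)
        "buffer.memory": 64 * 1024 * 1024,  # room to buffer records while the link is down (assumed value)
        "send.buffer.bytes": bdp,           # TCP send buffer sized to the link
        "receive.buffer.bytes": bdp,        # TCP receive buffer sized to the link
    }


if __name__ == "__main__":
    # Assumed link: 100 Mbit/s with an 80 ms round trip.
    for key, value in wan_producer_config("kafka.example.com:9092", 100, 80).items():
        print(f"{key} = {value}")
```

For the assumed 100 Mbit/s, 80 ms link, the bandwidth-delay product works out to about 1 MB, so that is what the sketch assigns to both socket buffers.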

High-level consumers are trickier since, as you note, they require a connection to ZooKeeper. When disconnects occur, there will be rebalancing and a higher chance that messages get duplicated.

Keep in mind that the producer needs to be able to reach every Kafka broker, and the consumer needs to be able to reach all ZooKeeper nodes and Kafka brokers; a load balancer won't work.
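To make the load balancer point concrete, here is a toy model (not real Kafka client code) of how a client picks its connections: it talks to the broker that leads each partition, using the address that broker advertises in the cluster metadata. All host names, the metadata layout, and the function name are hypothetical.

```python
# Toy metadata: partition number -> advertised address of the broker leading it.
PARTITION_LEADERS = {
    0: "broker1.dc.example:9092",
    1: "broker2.dc.example:9092",
    2: "broker3.dc.example:9092",
}


def unreachable_partitions(partition_leaders, reachable_addresses):
    """Return the partitions whose leader the client cannot connect to."""
    return {
        partition: address
        for partition, address in partition_leaders.items()
        if address not in reachable_addresses
    }


if __name__ == "__main__":
    # A client that can only reach one load-balanced address cannot reach any leader.
    print(unreachable_partitions(PARTITION_LEADERS, {"lb.dc.example:9092"}))
    # A client that can reach every advertised broker address has nothing missing.
    print(unreachable_partitions(PARTITION_LEADERS, set(PARTITION_LEADERS.values())))
```

In this model, exposing only a single front-end address leaves every partition unreachable, which is why each broker's advertised address has to be routable from the client side of the WAN.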

Upvotes: 2
