Tronyx

Reputation: 29

Kafka MirrorMaker Woes

Basically, MM is replicating MORE than I need it to.

I have four environments (DEV01, DEV02, TST01, and TST02), each with two servers running the same app, which generates JSON files. Logstash reads those files and pushes messages into two three-node Kafka clusters, KAF01 and KAF02. The DEV01 and TST01 boxes push to the KAF01 cluster, with corresponding DEV01 and TST01 topics, and the DEV02 and TST02 boxes push to the KAF02 cluster, with corresponding DEV02 and TST02 topics. Logstash also runs on each of the Kafka nodes to push the messages into the corresponding Elasticsearch clusters. This all works as expected.

I then added MM to replicate messages between environments, i.e. DEV01<->DEV02 and TST01<->TST02. I started the MM process for the DEV environments and everything worked fine. Then, on the same hosts, I started a second MM process for the TST environments, and everything seemed fine until I realized that I was seeing messages from TST in the DEV Elasticsearch cluster and vice versa.

Here's a rough diagram of the flow:

[Image: Flow Diagram]

I have MM running on the first host in each Kafka cluster, i.e. kaf01-01 and kaf02-01. For the KAF01 cluster, kaf01-01 is set up to mirror both the dev01 and tst01 topics to the KAF02 cluster:

kafka-mirror-maker.sh kafka.tools.MirrorMaker --consumer.config dev01_mm_source.properties --num.streams 1 --producer.config dev01_mm_target.properties --whitelist="dev01"

For --consumer.config, the dev01_mm_source.properties file is configured with the KAF01 Cluster nodes. For --producer.config, the dev01_mm_target.properties file is configured with the KAF02 Cluster nodes.
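For reference, here is a rough sketch of what that pair of files might look like. It assumes MirrorMaker is using the newer consumer (configured with bootstrap.servers rather than zookeeper.connect). Only kaf01-01 and kaf02-01 are hostnames from my setup as described above; the other node names, port 9092, and the group.id value are placeholders for illustration.

```shell
# Consumer side: MirrorMaker READS from the source cluster (KAF01).
# kaf01-02, kaf01-03, port 9092 and group.id are assumed placeholder values.
cat > dev01_mm_source.properties <<'EOF'
bootstrap.servers=kaf01-01:9092,kaf01-02:9092,kaf01-03:9092
group.id=mm-dev01
EOF

# Producer side: MirrorMaker WRITES to the target cluster (KAF02).
# kaf02-02, kaf02-03 and port 9092 are assumed placeholder values.
cat > dev01_mm_target.properties <<'EOF'
bootstrap.servers=kaf02-01:9092,kaf02-02:9092,kaf02-03:9092
EOF
```

If the consumer file pointed at KAF02 and the producer file at KAF01, the direction of mirroring would be reversed, so this is the first thing to compare against the real files.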

kafka-mirror-maker.sh kafka.tools.MirrorMaker --consumer.config tst01_mm_source.properties --num.streams 1 --producer.config tst01_mm_target.properties --whitelist="tst01"

For --consumer.config, the tst01_mm_source.properties file is configured with the KAF01 Cluster nodes. For --producer.config, the tst01_mm_target.properties file is configured with the KAF02 Cluster nodes.

For the KAF02 cluster, kaf02-01 is set up to mirror both the dev02 and tst02 topics to the KAF01 cluster:

kafka-mirror-maker.sh kafka.tools.MirrorMaker --consumer.config dev02_mm_source.properties --num.streams 1 --producer.config dev02_mm_target.properties --whitelist="dev02"

For --consumer.config, the dev02_mm_source.properties file is configured with the KAF02 Cluster nodes. For --producer.config, the dev02_mm_target.properties file is configured with the KAF01 Cluster nodes.

kafka-mirror-maker.sh kafka.tools.MirrorMaker --consumer.config tst02_mm_source.properties --num.streams 1 --producer.config tst02_mm_target.properties --whitelist="tst02"

For --consumer.config, the tst02_mm_source.properties file is configured with the KAF02 Cluster nodes. For --producer.config, the tst02_mm_target.properties file is configured with the KAF01 Cluster nodes.

Do I have things mixed up? Do I have the --consumer.config and --producer.config files backwards? Is the value I'm passing to the --whitelist option incorrect? I'm not really using a regex there, just a quoted string. I've triple-checked that Logstash on all of the app boxes is configured to push to the correct Kafka topic, and that Logstash on the Kafka boxes is configured to pull from the correct Kafka topic and then push to the correct Elasticsearch cluster.
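To sanity-check the whitelist question, here is a small sketch that emulates whole-name matching with grep. It assumes MirrorMaker treats the --whitelist value as a regex that must match the entire topic name; grep's extended regex syntax is only an approximation of Java's, and mirrored_topics is a helper name made up for this example.

```shell
# Print the topics that a given whitelist pattern would select.
# grep -x only succeeds when the pattern matches the whole line,
# which stands in for whole-topic-name matching (an assumption).
mirrored_topics() {
  whitelist=$1
  shift
  for topic in "$@"; do
    if printf '%s\n' "$topic" | grep -Eqx -- "$whitelist"; then
      printf '%s\n' "$topic"
    fi
  done
}

# A plain quoted string selects only the identically named topic.
mirrored_topics 'dev01' dev01 dev02 tst01 tst02

# An alternation would select both topics in a single process.
mirrored_topics 'dev01|tst01' dev01 dev02 tst01 tst02
```

Under that assumption, a whitelist of "dev01" cannot pick up tst01 or dev02, which would point the investigation away from the whitelist itself.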

Just started working with Kafka and MM today so I'm totally new to all of this and any/all help is greatly appreciated.

Upvotes: 0

Views: 911

Answers (1)

Tronyx

Reputation: 29

I have figured this out. I was trying to have Logstash output to two different ES clusters, which a single instance of Logstash apparently cannot do, so it was mixing the two streams together. MirrorMaker is working as expected. To separate things out, I moved Logstash onto the Elasticsearch nodes themselves, where it pulls from the Kafka topics, and everything is now working as expected.

Upvotes: 1
