Bob

Reputation: 319

Increasing the replication factor in a Kafka cluster doesn't work

Hi, I ran into a strange issue when increasing a Kafka topic's replication factor by following the steps in this document: https://kafka.apache.org/documentation/#basic_ops_increase_replication_factor

The symptom is that the replication factor increase appears to have no effect: after the original broker goes down, a consumer on the second broker receives no messages.

Please help.

My Kafka setup is

Kafka version: kafka_2.12-2.1.0

Server: hostname server-0 (192.168.0.1)

Server: hostname server-1 (192.168.0.2)

Topics

The DATA topic was first created from server-0 with a replication factor of 1:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic DATA

The result looks like this:

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic DATA
Topic:DATA PartitionCount:1 ReplicationFactor:1 Configs:
    Topic: DATA Partition: 0 Leader: 0 Replicas: 0 Isr: 0

After creating the topic, I produced some test messages:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic DATA
message 1
message 2

Then I increased the replication factor of topic DATA to 2, running the commands on server-0 only.

The JSON file below (topics-to-expand.json) is passed to kafka-reassign-partitions.sh to increase the replication factor:

{
  "version": 1,
  "partitions": [
    {"topic": "DATA", "partition": 0, "replicas": [0, 1]}
  ]
}

Command line:

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file topics-to-expand.json --execute

On the surface, the result looks good when I describe the topic:

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic DATA
Topic:DATA PartitionCount:1 ReplicationFactor:2 Configs:
    Topic: DATA Partition: 0 Leader: 0 Replicas: 0,1 Isr: 0,1

I produced some more test messages here:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic DATA
message 3
message 4

However, the problem arises when I try to test from server-1.

First I killed the Kafka process on server-0 with

kill -9 [kafka-pid]

Then I ran the console consumer on server-1:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic DATA --from-beginning

No messages show up, and the console just blocks on a blank screen.

According to the document, I should be able to see the messages because the replica on server-1 is (or was) in sync, shouldn't I?

Describing the topic shows:

bin/kafka-topics.sh --zookeeper server-0:2181 --describe --topic DATA
Topic:DATA PartitionCount:1 ReplicationFactor:2 Configs:
    Topic: DATA Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1

Then I restarted the Kafka process on server-0, and the consumer console all of a sudden showed the full message history:

message 1
message 2
message 3
message 4

It looks as if the consumer on server-1 didn't consume any data from server-1 locally, as though the topic data had not been replicated to server-1. Instead, it waits for server-0 to come back up and supply the data, even though server-1 is marked as the leader.

Can anyone reproduce my problem? I wanted to attach my properties files, but I don't know how to attach files on Stack Overflow, sorry about that.

Upvotes: 0

Views: 2336

Answers (1)

Bob

Reputation: 319

I figured out why, inspired by this post:

Killing node with __consumer_offsets leads to no message consumption at consumers

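The cause of the symptom above is that the default is offsets.topic.replication.factor=3, but I only have 2 brokers (nodes) in the cluster. When Kafka first created the __consumer_offsets topic, it silently ended up with a replication factor of 1 (yikes). So the DATA topic itself was replicated correctly, but the consumer's offsets lived only on server-0, and the consumer blocked once that broker was killed.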
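One way to check for this on a cluster is to describe __consumer_offsets and look for partitions that have only one replica. Below is a minimal sketch, not a definitive tool: the helper name single_replica_partitions is made up for illustration, and it assumes the "Partition: N ... Replicas: a,b" line layout that kafka-topics.sh --describe prints in the question.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints the number of every partition whose Replicas list has a single broker.
single_replica_partitions() {
  awk '
    /Partition: / {
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") p = $(i + 1)
        # A replica list without a comma means exactly one replica.
        if ($i == "Replicas:" && $(i + 1) !~ /,/) print p
      }
    }'
}

# Usage (assumes ZooKeeper on localhost:2181, as in the question):
#   bin/kafka-topics.sh --zookeeper localhost:2181 --describe \
#     --topic __consumer_offsets | single_replica_partitions
```

If this prints any partition numbers, those offsets partitions become unavailable when their only broker goes down.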

Setting offsets.topic.replication.factor=2 in the broker properties file solves the problem above. (Yes, tested!)
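One caveat: as far as I know, offsets.topic.replication.factor is only applied when __consumer_offsets is first created, so on a cluster where that topic already exists with a single replica it would need the same reassignment treatment as DATA. A rough sketch for generating that JSON follows; gen_reassignment is a hypothetical helper, the output file name is arbitrary, and the 50 in the usage comment assumes the default offsets.topic.num.partitions.

```shell
# Hypothetical helper: prints reassignment JSON that places every partition
# of a topic on the same list of brokers.
#   $1 = topic name, $2 = number of partitions, $3 = comma-separated broker ids
gen_reassignment() {
  topic=$1
  partitions=$2
  replicas=$3
  printf '{"version":1,"partitions":['
  i=0
  while [ "$i" -lt "$partitions" ]; do
    # Separate entries with commas, but not before the first one.
    if [ "$i" -gt 0 ]; then
      printf ','
    fi
    printf '{"topic":"%s","partition":%d,"replicas":[%s]}' "$topic" "$i" "$replicas"
    i=$((i + 1))
  done
  printf ']}\n'
}

# Usage (assumes 50 offsets partitions and broker ids 0 and 1):
#   gen_reassignment __consumer_offsets 50 0,1 > offsets-to-expand.json
#   bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
#     --reassignment-json-file offsets-to-expand.json --execute
```

Putting every partition on "0,1" makes broker 0 the preferred leader for all of them; with more brokers it would be worth varying the order per partition to spread leadership.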

Upvotes: 1
