Jitu

Reputation: 329

Kafka | Increase replication factor of multiple topics

I have a 3-broker Kafka cluster with many topics that have a replication factor of 1. I know I can increase it by passing a JSON file with the partition reassignment configuration to kafka-reassign-partitions.sh.

My question is: should I pass a single JSON file with the partition reassignment details of all topics, or should I create a JSON file for each topic and run them individually?

Upvotes: 2

Views: 4845

Answers (2)

Mickael Maison

Reputation: 26875

This is a balance of cost and risk.

  1. Reassigning all topics together:

    • Pros: easy to run, single command. Single task to monitor
    • Cons: Not a lot of control. Depending on your cluster, the process could copy a lot of data. While you can set reassignment quotas, it can be hard to precisely control the bandwidth the reassignment uses, so it can affect other services using the cluster.
  2. Reassigning topics in "small" chunks:

    • Pros: This allows more control over the impact a large reassignment can have
    • Cons: Operators have to split the reassignment, then run and monitor each chunk.

Depending on the size and usage of your cluster, you should be able to identify which method is best for you. In a busy cluster, I'd recommend setting reassignment quotas and reassigning topics in chunks. Otherwise the reassignment will try to execute as fast as possible, which can impact the cluster greatly. If your cluster is mostly fresh or unused, you may be able to reassign all topics at the same time.
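To illustrate the chunked approach, here is a minimal Python sketch that splits one large reassignment plan into smaller plans, keeping all partitions of a topic in the same chunk. The names `split_by_topic` and `topics_per_chunk` are hypothetical helpers made up for this example, not part of any Kafka tool, and the plan layout assumes the standard reassignment JSON format with "version" and "partitions" keys.

```python
import json


def split_by_topic(plan, topics_per_chunk):
    """Split a reassignment plan into smaller plans.

    All partitions of a topic stay in the same chunk, and topics keep
    the order in which they first appear in the original plan.
    """
    if topics_per_chunk < 1:
        raise ValueError("topics_per_chunk must be at least 1")

    # Group the partition entries by topic (dicts preserve insertion order).
    by_topic = {}
    for entry in plan["partitions"]:
        by_topic.setdefault(entry["topic"], []).append(entry)

    topics = list(by_topic)
    chunks = []
    for i in range(0, len(topics), topics_per_chunk):
        partitions = [
            entry
            for topic in topics[i:i + topics_per_chunk]
            for entry in by_topic[topic]
        ]
        chunks.append({"version": plan["version"], "partitions": partitions})
    return chunks


# Example plan with two topics (illustrative values only).
example_plan = {
    "version": 1,
    "partitions": [
        {"topic": "topic_1", "partition": 0, "replicas": [0, 1]},
        {"topic": "topic_1", "partition": 1, "replicas": [1, 0]},
        {"topic": "topic_2", "partition": 0, "replicas": [0, 1]},
        {"topic": "topic_2", "partition": 1, "replicas": [1, 0]},
    ],
}

chunks = split_by_topic(example_plan, topics_per_chunk=1)
for chunk in chunks:
    # Each chunk can be saved to its own file and executed separately.
    print(json.dumps(chunk))
```

Each chunk would then be written to its own JSON file and passed to kafka-reassign-partitions.sh one at a time, waiting for one chunk to finish before starting the next.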

Upvotes: 1

Giorgos Myrianthous

Reputation: 39810

You can either create multiple .json files or use a single file that contains reassignment details for more than one topic:

{
  "version":1,
  "partitions":[
      {"topic":"topic_1","partition":0,"replicas":[0,1]},
      {"topic":"topic_1","partition":1,"replicas":[1,0]}, 
      {"topic":"topic_2","partition":0,"replicas":[0,1]},
      {"topic":"topic_2","partition":1,"replicas":[1,0]}
  ]
}
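With many topics, writing this file by hand gets tedious. Below is a minimal Python sketch that generates the same structure. The function name `build_plan` and its parameters are hypothetical, chosen for this example. It also assumes that partition p of every topic currently lives on the broker at index p modulo the number of brokers, so the existing replica stays first in the list. In a real cluster you would take the current assignment from the kafka-topics.sh --describe output instead.

```python
import json


def build_plan(partition_counts, brokers, replication_factor):
    """Build a reassignment plan that spreads replicas round-robin.

    partition_counts: mapping of topic name to its number of partitions.
    brokers: list of broker ids available in the cluster.
    replication_factor: desired number of replicas per partition.

    Assumption: partition p is currently hosted on
    brokers[p % len(brokers)], which is kept as the first replica.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the broker count")

    partitions = []
    for topic, count in partition_counts.items():
        for p in range(count):
            start = p % len(brokers)
            replicas = [
                brokers[(start + i) % len(brokers)]
                for i in range(replication_factor)
            ]
            partitions.append(
                {"topic": topic, "partition": p, "replicas": replicas}
            )
    return {"version": 1, "partitions": partitions}


plan = build_plan(
    {"topic_1": 2, "topic_2": 2}, brokers=[0, 1], replication_factor=2
)
print(json.dumps(plan, indent=2))
```

For the two topics above this produces the same replica lists as the hand-written JSON.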

Then save the JSON above as increase-replication-factor.json and run:

./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute

Once the reassignment has completed, describing the topics should show a replication factor of 2:

./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topic_1
Topic:topic_1        PartitionCount:2        ReplicationFactor:2     Configs:
        Topic: topic_1       Partition: 0    Leader: 0       Replicas: 0,1     Isr: 0,1
        Topic: topic_1       Partition: 1    Leader: 1       Replicas: 1,0     Isr: 1,0

./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topic_2
Topic:topic_2        PartitionCount:2        ReplicationFactor:2     Configs:
        Topic: topic_2       Partition: 0    Leader: 0       Replicas: 0,1     Isr: 0,1
        Topic: topic_2       Partition: 1    Leader: 1       Replicas: 1,0     Isr: 1,0

Finally, the --verify option can be used with the tool to check the status of the partition reassignment. Note that the same increase-replication-factor.json (used with the --execute option) should be used with the --verify option:

./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --verify
Status of partition reassignment:
Reassignment of partition [topic_1,0] completed successfully
Reassignment of partition [topic_1,1] is in progress
Reassignment of partition [topic_2,0] completed successfully
Reassignment of partition [topic_2,1] completed successfully 
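If you run the reassignment for many topics, you may want to check this output from a script rather than by eye. Here is a minimal Python sketch that lists the partitions that have not finished yet. The function name `unfinished` is hypothetical, and the parsing assumes the exact line format shown above. Other Kafka versions may word these status lines differently, so treat the regular expression as something to adapt.

```python
import re

# Matches lines such as:
#   Reassignment of partition [topic_1,0] completed successfully
STATUS_LINE = re.compile(
    r"Reassignment of partition \[(?P<topic>.+),(?P<partition>\d+)\] "
    r"(?P<status>.+)"
)


def unfinished(verify_output):
    """Return (topic, partition) pairs whose reassignment is not complete."""
    pending = []
    for line in verify_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match is None:
            continue  # header or unrelated line
        if match.group("status").strip() != "completed successfully":
            pending.append(
                (match.group("topic"), int(match.group("partition")))
            )
    return pending


# Sample taken from the --verify output above.
sample_output = """Status of partition reassignment:
Reassignment of partition [topic_1,0] completed successfully
Reassignment of partition [topic_1,1] is in progress
Reassignment of partition [topic_2,0] completed successfully
Reassignment of partition [topic_2,1] completed successfully
"""

pending = unfinished(sample_output)
print(pending)
```

An empty list would mean every partition in the file has been reassigned, which is a reasonable signal to start the next file if you are working in chunks.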

Upvotes: 2
