ethrbunny

Reputation: 10469

kafka: replicas and ISR don't match

I'm moving hundreds of topics from one broker to another. The process is:

  1. Use kafka-topics.sh to generate the list of existing topics and partitions.
  2. Use kafka-reassign-partitions.sh to generate the current assignment of partitions to brokers.
  3. Edit this list so every instance of broker 7 (to be replaced) becomes brokers 7,4 (4 is the new broker).
  4. Run kafka-reassign-partitions.sh with the edited list and --execute to add the new broker.
  5. Wait and watch (using --verify) until the reassignment completes.
  6. Edit the list so the brokers are ordered 4,7.
  7. Repeat (4) and (5).
  8. Run a preferred leader election (in case 7 was leading anything).
  9. Edit the list to remove all instances of broker 7.
  10. Repeat (4) and (5).
  11. Be happy.
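The three list edits (steps 3, 6 and 9) can be scripted rather than done by hand. Below is a minimal sketch in Python, assuming the reassignment file uses the usual JSON layout with a top-level "partitions" array whose entries carry "topic", "partition" and "replicas". The helper names (rewrite_replicas, add_after, swap, drop) and the sample plan are made up for illustration and are not part of any Kafka tooling.

```python
import copy
import json


def rewrite_replicas(plan, transform):
    """Return a copy of a reassignment plan with each replica list transformed."""
    out = copy.deepcopy(plan)
    for entry in out["partitions"]:
        entry["replicas"] = transform(entry["replicas"])
    return out


def add_after(old, new):
    # Step 3: 7 becomes 7,4 (the new broker is added right after the old one)
    def transform(replicas):
        if old not in replicas or new in replicas:
            return list(replicas)
        result = []
        for broker in replicas:
            result.append(broker)
            if broker == old:
                result.append(new)
        return result
    return transform


def swap(old, new):
    # Step 6: 7,4 becomes 4,7 (the two brokers trade positions)
    def transform(replicas):
        if old not in replicas or new not in replicas:
            return list(replicas)
        result = list(replicas)
        i, j = result.index(old), result.index(new)
        result[i], result[j] = result[j], result[i]
        return result
    return transform


def drop(old):
    # Step 9: remove the old broker from every replica list
    def transform(replicas):
        return [b for b in replicas if b != old]
    return transform


# Hypothetical sample plan; real plans come from kafka-reassign-partitions.sh.
plan = {"version": 1, "partitions": [
    {"topic": "shard_3", "partition": 7, "replicas": [3, 7]},
]}
step3 = rewrite_replicas(plan, add_after(7, 4))
step6 = rewrite_replicas(step3, swap(7, 4))
step9 = rewrite_replicas(step6, drop(7))
print(json.dumps(step9))
```

Each intermediate plan (step3, step6, step9) would be written to a file and passed to the reassignment tool in turn, waiting for --verify to report completion before moving on.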

This has worked great for hundreds and hundreds of topics, except for one sticky one. This holdout refuses to sync up with the new broker: broker 4 is missing from the ISR list even though it's included in the list of replicas.

Output from kafka-topics.sh (trying to replace broker 7 with broker 4):

Topic: shard_3 Partition: 7 Leader: 3 Replicas: 3,4,7 Isr: 7,3
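With hundreds of topics, spotting a line like this by eye is tedious. Here is a rough sketch in Python that parses one line of that output and reports which replicas are absent from the ISR. It assumes the line format shown above; the function name missing_from_isr is made up for illustration.

```python
import re

# Matches one partition line in the format shown in the question.
LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def missing_from_isr(line):
    """Return (topic, partition, replicas not in the ISR), or None if the line doesn't match."""
    m = LINE.search(line)
    if m is None:
        return None
    replicas = [int(b) for b in m.group("replicas").split(",")]
    isr = {int(b) for b in m.group("isr").split(",") if b}
    lagging = [b for b in replicas if b not in isr]
    return m.group("topic"), int(m.group("partition")), lagging


line = "Topic: shard_3 Partition: 7 Leader: 3 Replicas: 3,4,7 Isr: 7,3"
print(missing_from_isr(line))  # ('shard_3', 7, [4])
```

Running this over every line of the describe output and keeping only the entries with a non-empty third element would list the partitions that are still out of sync.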

I've run (4) above several times in hopes of getting this to complete but it doesn't seem to want to. I've waited overnight in case it's just really slow.

Suggestions on how to unstick this one?

Upvotes: 2

Views: 6019

Answers (1)

ethrbunny

Reputation: 10469

Turns out the controller broker was in a bad state and the partition list wasn't being kept up to date.

Solution:

  1. Run bin/zkCli.sh -server <ZooKeeper host:port for the cluster>
  2. Run get /controller to see which broker is currently the controller
  3. Restart the Kafka service on that controller broker; this passes control to another broker
  4. Retry the partition reassignment commands
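For anyone scripting step 2: on ZooKeeper-based Kafka clusters the /controller znode typically holds a small JSON document with a brokerid field. A minimal sketch in Python for pulling the ID out of it, assuming that layout. The sample payload and the function name controller_broker_id are made up for illustration, so check what your own get /controller actually prints.

```python
import json


def controller_broker_id(znode_data):
    """Extract the controller's broker ID from the /controller znode payload."""
    return int(json.loads(znode_data)["brokerid"])


# Hypothetical payload, shaped like what `get /controller` is assumed to return.
sample = '{"version":1,"brokerid":3,"timestamp":"1490926369148"}'
print(controller_broker_id(sample))  # 3
```

The broker with that ID is the one to restart in step 3.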

Upvotes: 3
