Reputation: 988
I have a 5-node SolrCloud (Solr 7.0) with an external 3-node ZooKeeper ensemble. There is one collection called "production" that is split into 5 shards with a replication factor of 5. See the screenshot below:
shard5 was struggling to elect a new leader for a long time and other cores were complaining with the following error:
azsolr1 solr: 2018-08-28 19:32:43.575 ERROR (qtp1124317168-9304) [c:production s:shard2 r:core_node9 x:production_shard2_replica_n4] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: production slice: shard5
After restarting all nodes one by one (I even restarted the zookeeper nodes), I had no luck in electing the only active replica (azsolr1) as the leader. I then unloaded the 4 replicas with the 'down' state using the CoreAdmin API UNLOAD command which caused the replicas to disappear completely.
With that setup, trying to force the leader of the shard using the Collection API FORCELEADER does nothing. I also tried this before unloading the cores.
Here is the current status:
Why can't Solr just elect the only active replica for shard 5 as the leader? Isn't this obvious, especially after forcing the leader on the shard?
Assuming the leader was elected successfully somehow, do I recreate the replicas that I deleted using the Collection API ADDREPLICA? In this case, should I reuse the same instanceDir and dataDir of the deleted replicas, or should I just let them replicate from scratch?
Upvotes: 5
Views: 9348
Reputation: 4352
I had the same problem.
I had one collection with 3 replicas (solr1, which was the leader before, plus solr2 and solr3). One of the shards had no leader, and I took these steps:
1 - Stop solr2 and solr3.
2 - Call the FORCELEADER API (http://xx.xx.xxx.xx:8983/solr/admin/collections?action=FORCELEADER&collection=your_collection_name&shard=shard1).
3 - After a few minutes, solr1 was elected as the leader.
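The steps above could be scripted roughly as follows. This is only a sketch: the helper names (`collections_url`, `find_leader`, `force_leader_and_wait`), the timeout values, and the `wt=json` parameter are my own assumptions, and it relies on the CLUSTERSTATUS response marking the leader replica with `"leader": "true"`. Stopping the other nodes still has to be done by hand first.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def collections_url(base_url, action, **params):
    """Build a Collections API URL, e.g. base_url = 'http://host:8983/solr'."""
    query = urlencode({"action": action, "wt": "json", **params})
    return f"{base_url}/admin/collections?{query}"


def find_leader(cluster_status, collection, shard):
    """Return the name of the active leader replica in a CLUSTERSTATUS response, or None."""
    replicas = cluster_status["cluster"]["collections"][collection]["shards"][shard]["replicas"]
    for name, replica in replicas.items():
        if replica.get("leader") == "true" and replica.get("state") == "active":
            return name
    return None


def force_leader_and_wait(base_url, collection, shard, timeout_s=300, poll_s=10):
    """Call FORCELEADER once, then poll CLUSTERSTATUS until a leader shows up or we time out."""
    urlopen(collections_url(base_url, "FORCELEADER", collection=collection, shard=shard)).read()
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urlopen(collections_url(base_url, "CLUSTERSTATUS", collection=collection)) as resp:
            status = json.load(resp)
        leader = find_leader(status, collection, shard)
        if leader:
            return leader
        time.sleep(poll_s)  # the election took a few minutes in my case
    return None
```

Something like `force_leader_and_wait("http://xx.xx.xxx.xx:8983/solr", "your_collection_name", "shard1")` would then return the replica name once a leader is registered, or `None` if the timeout expires.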
Upvotes: 3
Reputation: 988
Restarting azsolr1, which was hosting the only replica for shard5, forced the election of the leader. Sounds crazy, but that was it.
After doing that, I added the other 4 replicas using the ADDREPLICA command.
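For reference, the ADDREPLICA calls can be generated with a small helper like the one below. It is a sketch under assumptions: the function name `add_replica_urls` is made up, and the node names in the usage example (azsolr2 and so on, in the usual `host:port_solr` form) are hypothetical placeholders for your own hosts. It does not pass instanceDir or dataDir, so each new replica is created fresh and is expected to sync its index from the leader.

```python
from urllib.parse import urlencode


def add_replica_urls(base_url, collection, shard, nodes):
    """Return one ADDREPLICA URL per target node (node names as listed under live_nodes)."""
    urls = []
    for node in nodes:
        query = urlencode({
            "action": "ADDREPLICA",
            "collection": collection,
            "shard": shard,
            "node": node,  # hypothetical node name, e.g. "azsolr2:8983_solr"
        })
        urls.append(f"{base_url}/admin/collections?{query}")
    return urls


if __name__ == "__main__":
    # Hypothetical hosts; print the URLs so they can be reviewed before being requested.
    targets = [f"azsolr{i}:8983_solr" for i in range(2, 6)]
    for url in add_replica_urls("http://azsolr1:8983/solr", "production", "shard5", targets):
        print(url)
```

Each printed URL can be requested one at a time (browser, curl, or `urllib.request.urlopen`), waiting for the new replica to become active before adding the next.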
Upvotes: 3