Reputation: 77
Configuration: a Redis Cluster with three partitions, each consisting of one master and one slave. When a master goes down, Lettuce immediately detects the outage and begins retrying. However, Lettuce does not detect that the associated slave has promoted itself to master; it keeps retrying against the old, unreachable master and eventually times out. I tried setting various topology refresh options, to no avail.
Proposed solution: after the first retry fails (that is, the second attempt in a row to fail), rerun the topology refresh that was used to derive the topology during initialization, querying any of the provided nodes (since they all have the same topology information). This would reestablish the connections to the now-current masters. Then retry the failed operation on the partition that previously failed.
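The proposed flow can be sketched independently of Lettuce. The following is only an illustration of the idea, not something Lettuce does internally: `RefreshingRetry`, `failuresBeforeRefresh`, and the `refreshTopology` callback are hypothetical names, and the callback stands in for whatever call reloads the cluster topology in the client.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries an operation and triggers a topology
// refresh once a number of consecutive failures has been reached.
final class RefreshingRetry {

    private RefreshingRetry() {}

    static <T> T call(Callable<T> op,
                      Runnable refreshTopology,
                      int failuresBeforeRefresh,
                      int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        int failures = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                failures++;
                // Refresh only if another attempt will follow.
                if (failures >= failuresBeforeRefresh && attempt < maxAttempts) {
                    refreshTopology.run();
                }
            }
        }
        // All attempts failed: surface the most recent failure.
        throw last;
    }
}
```

With `failuresBeforeRefresh` set to 2, the refresh runs after the initial attempt and the first retry have both failed, and the next attempt then runs against the refreshed topology.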
Upvotes: 0
Views: 1807
Reputation: 18119
Redis Cluster is limited in terms of configuration update propagation compared to Redis Sentinel. Redis Sentinel communicates updates via Pub/Sub while Redis Cluster leaves polling as the sole option.
Lettuce supports periodic and adaptive cluster topology refresh triggers. Periodic refresh updates the topology at a regular interval; adaptive refresh reacts to disconnects and cluster redirections.
You can configure both through ClusterClientOptions.
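As a rough sketch of what such a configuration might look like, assuming Lettuce 5 or later (package `io.lettuce.core`): the node URI and the 30-second interval are placeholder values, not recommendations.

```java
import java.time.Duration;

import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.RedisClusterClient;

public class TopologyRefreshConfig {

    public static void main(String[] args) {
        ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
                // Periodic trigger: poll the topology at a fixed interval (placeholder value).
                .enablePeriodicRefresh(Duration.ofSeconds(30))
                // Adaptive triggers: react to disconnects and cluster redirections.
                .enableAllAdaptiveRefreshTriggers()
                .build();

        ClusterClientOptions clientOptions = ClusterClientOptions.builder()
                .topologyRefreshOptions(refreshOptions)
                .build();

        // Placeholder seed node; any reachable cluster node should work.
        RedisClusterClient client = RedisClusterClient.create("redis://localhost:7000");
        client.setOptions(clientOptions);

        client.shutdown();
    }
}
```

A shorter periodic interval narrows the window in which the client holds a stale view, at the cost of more topology queries against the cluster.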
Periodic and adaptive refreshes try to cover most cases, but they are largely guesswork that compensates for the lack of proper configuration change propagation. There are always loopholes (see issue #672) in which Lettuce is faster than the actual topology change. This leaves Lettuce with an outdated topology view, because the actual change happens somewhat later.
Upvotes: 2