Reputation: 1185
I have 3 ActiveMQ brokers in a networked Shared File System(GlusterFS)/Master Slave configuration - all in VMs.
If the master fails the client should failover to the new master.
The issue I have is that the connection to the new master takes about 50 seconds.
Is that reasonable?
How to improve it?
My client connection looks like this
failover:(tcp://a1:61616?connectionTimeout=1000,tcp://a2:61616?connectionTimeout=1000,tcp://a3:61616?connectionTimeout=1000)?randomize=false&maxReconnectDelay=10000&backup=true"
Also when disconnecting the master by disconnecting network cable it stops and throws an exception regarding the kahaDB (which is on GlusterFS) and needs to be restarted.
Is there a workaround for this behavior so the master broker auto-restarts or is able to connect automatically once the network comes back?
Upvotes: 0
Views: 929
Reputation: 1523
The failover depends on the time the underlying file system take for releasing the file lock.
In your case, the NFS cluster is waiting 50s to detect that the first node is lost and so release the lock on the kahadb file, wich can then be taken by the seconde node.
You can customize this delay with the NFSD_V4_GRACE
and NFSD_V4_LEASE
parameters in the NFS server configuration file (/etc/sysconfig/nfs
on redhat/centos systems).
You can also customize the kahadb lockKeepAlivePeriod, see http://activemq.apache.org/pluggable-storage-lockers.html
Upvotes: 1