How does the failure detection and recovery mechanism in Cassandra work?

Question

To all Cassandra experts,

I am trying to understand Cassandra's failure detection and recovery, and I am a little confused about how exactly it works.

From the DataStax docs:

Configuring the phi_convict_threshold property adjusts the sensitivity of the failure detector. Lower values increase the likelihood that an unresponsive node will be marked as down, while higher values decrease the likelihood that transient failures will cause a node failure. In unstable network environments (such as EC2 at times), raising the value to 10 or 12 helps prevent false failures.

From http://ljungblad.nu/post/44006928392/cassandra-and-its-accrual-failure-detector

Phi represents the likelihood that Node A is wrong about Node B’s state. The higher the Phi, the bigger the confidence that Node B has failed.
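
To make the two quotes concrete, here is a rough sketch of how an accrual-style detector might turn heartbeat timing into a phi value and compare it with phi_convict_threshold. It is only an illustration under stated assumptions, not Cassandra's actual code: it assumes heartbeat gaps follow an exponential distribution, and the names `phi`, `is_convicted`, `intervals`, and `silence` are made up for this example. The threshold of 8 is Cassandra's default; 12 is the raised value the docs suggest for unstable networks.

```python
import math


def phi(time_since_last_heartbeat, mean_interval):
    """Suspicion level for a silent node.

    Assumption of this sketch: gaps between heartbeats are exponentially
    distributed, so P(gap > t) = exp(-t / mean). Phi is -log10 of that
    probability, which simplifies to (t / mean) / ln(10).
    """
    return (time_since_last_heartbeat / mean_interval) / math.log(10)


def is_convicted(time_since_last_heartbeat, intervals, threshold):
    """Mark the node as down once phi exceeds the configured threshold."""
    mean_interval = sum(intervals) / len(intervals)
    return phi(time_since_last_heartbeat, mean_interval) > threshold


# Hypothetical observed gaps between heartbeats from Node B, in seconds.
intervals = [1.0, 1.1, 0.9, 1.0]

# Node B has now been silent for 20 seconds (hypothetical value).
silence = 20.0

print(round(phi(silence, 1.0), 2))           # about 8.69
print(is_convicted(silence, intervals, 8))   # True: above the default threshold
print(is_convicted(silence, intervals, 12))  # False: a raised threshold tolerates the gap
```

In this sketch the same 20-second silence convicts the node at a threshold of 8 but not at 12, which matches the docs' statement that higher values make the detector less sensitive to transient failures.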

Can someone explain in detail how C*'s failure detection mechanism works and how C* recovers from failures in different scenarios?

Thanks in advance

Chaity
