Reputation:
I want to understand how Cassandra ensures high availability. As I understand it, when we query a Cassandra database, a node called the coordinator routes the query to the appropriate node in the cluster that holds the required data. But what if the node we specify in the JDBC connection URL (which I think acts as the coordinator in the cluster; please correct me if I am wrong) is itself down? How does Cassandra ensure high availability in that case?
Perhaps we, as developers, must provide a fallback mechanism for that?
Upvotes: 1
Views: 4675
Reputation: 132972
In a Cassandra cluster all nodes are equal. There are no masters or coordinators at the cluster level. When you connect to a cluster you usually specify one or more nodes to connect to, but once the driver has connected it can find out about the other nodes. This means that if the first node it connected to goes down, it knows about the other nodes and can connect to one of them instead.
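To make the failover idea concrete, here is a minimal toy model in plain Python. It is not a real driver API: `ToyDriver`, the node names `n1` to `n3`, and the `down` set are all made up for illustration, and a real driver learns the cluster topology from the node it connects to rather than from a constructor argument.

```python
class ToyDriver:
    """Toy model of a client driver; hypothetical, not a real Cassandra driver."""

    def __init__(self, contact_points, cluster_nodes, down=()):
        self.down = set(down)      # nodes that are currently unreachable
        self.known_nodes = []      # filled in after the first successful connect
        # Try each contact point in turn until one answers.
        for node in contact_points:
            if node not in self.down:
                # A live node tells the driver about every node in the cluster.
                self.known_nodes = list(cluster_nodes)
                break
        if not self.known_nodes:
            raise ConnectionError("no contact point reachable")

    def execute(self, query):
        # Send the query to the first live node; return the node that handled it.
        for node in self.known_nodes:
            if node not in self.down:
                return node
        raise ConnectionError("all known nodes are down")


driver = ToyDriver(contact_points=["n1", "n2"], cluster_nodes=["n1", "n2", "n3"])
before = driver.execute("SELECT ...")
driver.down.add("n1")                  # the node we first connected to fails
after = driver.execute("SELECT ...")
print(before, after)                   # n1 n2
```

Listing more than one contact point matters only for the initial connect: if the single node in your connection URL is down at startup, the driver has nobody to ask about the rest of the cluster.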
If a query is sent to a node which itself does not host the requested data (or a consistency level higher than one is specified), that node acts as a coordinator for the query, but that is a temporary role, and any node can take that role for any query.
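As a rough sketch of what the coordinator has to check, assuming a simplified model in which it only counts live replicas: the function names and node names below are hypothetical, and the only real Cassandra facts used are the level names and that QUORUM means a majority of the replicas (`rf // 2 + 1`).

```python
def required_acks(consistency, rf):
    """Number of replica responses the coordinator must collect."""
    levels = {"ONE": 1, "TWO": 2, "THREE": 3, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[consistency]


def coordinate(replica_nodes, live_nodes, consistency):
    """Toy coordinator: return the replicas it would wait for, or fail."""
    needed = required_acks(consistency, len(replica_nodes))
    responders = [n for n in replica_nodes if n in live_nodes]
    if len(responders) < needed:
        # Simplified stand-in for the "not enough replicas available" failure.
        raise RuntimeError(f"need {needed} replicas, only {len(responders)} alive")
    return responders[:needed]


# Three replicas, one of them down: QUORUM (2 of 3) still succeeds.
print(coordinate(["n1", "n2", "n3"], {"n2", "n3"}, "QUORUM"))  # ['n2', 'n3']
```

With the same outage, `coordinate(["n1", "n2", "n3"], {"n2", "n3"}, "ALL")` raises, which is why the consistency level you pick determines how many node failures a query can tolerate.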
There are even drivers, such as Astyanax, that connect to multiple nodes and try to figure out which node contains the requested data and use the connection to that node to do the query, in order to minimize the query time.
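The routing idea can be sketched like this. It is a toy under loud assumptions: a four-node ring with made-up tokens, MD5 reduced modulo 100 as a stand-in for the cluster's real partitioner, and replicas placed on the next nodes clockwise. None of the names below come from Astyanax or any other driver.

```python
import hashlib
from bisect import bisect_left

NODES = ["n1", "n2", "n3", "n4"]   # hypothetical nodes
TOKENS = [24, 49, 74, 99]          # NODES[i] owns tokens up to and including TOKENS[i]


def token(key):
    # Stand-in hash; a real cluster uses its configured partitioner.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 100


def owner_index(t):
    # First node whose token is >= t, wrapping around the ring.
    return bisect_left(TOKENS, t) % len(TOKENS)


def replicas(key, rf=2):
    # The owner plus the next rf - 1 nodes clockwise on the ring.
    start = owner_index(token(key))
    return [NODES[(start + i) % len(NODES)] for i in range(rf)]


def pick_coordinator(key, live_nodes, rf=2):
    # Prefer a live replica so the query avoids an extra network hop.
    for node in replicas(key, rf):
        if node in live_nodes:
            return node
    # No replica is reachable: fall back to any live node as coordinator.
    if live_nodes:
        return sorted(live_nodes)[0]
    raise ConnectionError("no live nodes")


print(replicas("user:42"), pick_coordinator("user:42", set(NODES)))
```

When every replica for a key is down, the query still reaches some live coordinator; whether it then succeeds depends on the consistency level, not on the routing.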
Upvotes: 4