Reputation: 34900
For various reasons, I need to query a particular datacenter within my Cassandra cluster. According to the documentation, I can use the LOCAL_QUORUM consistency level:
Returns the record after a quorum of replicas in the current datacenter as the coordinator has reported. Avoids latency of inter-datacenter communication.
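As a small illustration of what "a quorum of replicas in the current datacenter" means numerically, here is a minimal sketch. It assumes the usual majority rule (half the replication factor, rounded down, plus one); `QuorumMath` and the replication factor of 3 are hypothetical names and values for illustration, not part of the driver.

```java
// Toy helper, NOT part of the DataStax driver: shows how many replicas in the
// coordinator's datacenter must respond for LOCAL_QUORUM, assuming the usual
// majority rule floor(rf / 2) + 1.
class QuorumMath {
    // Returns the number of replicas that form a strict majority.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division rounds down
    }

    public static void main(String[] args) {
        int rfLocalDc = 3; // assumed replication factor of the local DC
        System.out.println("LOCAL_QUORUM needs " + quorum(rfLocalDc)
                + " of " + rfLocalDc + " local replicas");
    }
}
```

Only the replication factor of the coordinator's own datacenter enters the calculation, which is why the choice of coordinator decides which DC is "local".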
Do I understand correctly that, in order to target a particular datacenter for a query, I have to build the cluster from a contact point that belongs to that DC?
Say I have two DCs with the following nodes:
DC1: 172.0.1.1, 172.0.1.2
DC2: 172.0.2.1, 172.0.2.2
So, to work with DC1, I build a cluster as:
Cluster cluster = Cluster.builder().addContactPoint("172.0.1.1").build();
Session session = cluster.connect();
Statement statement = session.prepare("select * from ...").bind().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
ResultSet resultSet = session.execute(statement);
Is this the proper way to do it?
Upvotes: 4
Views: 2313
Reputation: 57748
By itself, DCAwareRoundRobinPolicy will pick the data center that it finds with the "least network distance" algorithm. To ensure it connects where you want, you should specify the DC as a parameter.
Here is how I tell our dev teams to do it:
Builder builder = Cluster.builder()
    .addContactPoints(nodes)
    .withQueryOptions(new QueryOptions()
        .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
    .withLoadBalancingPolicy(new TokenAwarePolicy(
        new DCAwareRoundRobinPolicy.Builder()
            .withLocalDc("DC1").build()))
    .withPoolingOptions(options);
Note: this may or may not be applicable to your situation, but I do recommend using the TokenAwarePolicy with the DCAwareRoundRobinPolicy nested inside it (specifying the local DC). That way, any operation that specifies the partition key is routed directly to a node holding that data, skipping the extra hop through a separate coordinator node.
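To make the effect of naming the local DC concrete, here is a toy model of the idea. It is not the driver's implementation: `ToyDcAwareRoundRobin` and its methods are hypothetical, and it assumes remote-DC hosts are never used as coordinators. Only hosts in the named DC are kept, and they are handed out in round-robin order, regardless of which contact point was used to discover the cluster.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical class, not the DataStax driver): keeps the
// hosts of one named datacenter and cycles through them as coordinators.
class ToyDcAwareRoundRobin {
    private final List<String> localHosts = new ArrayList<>();
    private int next = 0;

    ToyDcAwareRoundRobin(Map<String, String> hostToDc, String localDc) {
        // Keep only the hosts whose datacenter matches the configured local DC.
        for (Map.Entry<String, String> entry : hostToDc.entrySet()) {
            if (entry.getValue().equals(localDc)) {
                localHosts.add(entry.getKey());
            }
        }
        if (localHosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts in datacenter " + localDc);
        }
    }

    // Returns the next local host in round-robin order.
    String nextCoordinator() {
        String host = localHosts.get(next);
        next = (next + 1) % localHosts.size();
        return host;
    }

    public static void main(String[] args) {
        // Addresses taken from the question; the DC labels are assumed.
        Map<String, String> hostToDc = new LinkedHashMap<>();
        hostToDc.put("172.0.1.1", "DC1");
        hostToDc.put("172.0.1.2", "DC1");
        hostToDc.put("172.0.2.1", "DC2");
        hostToDc.put("172.0.2.2", "DC2");

        ToyDcAwareRoundRobin policy = new ToyDcAwareRoundRobin(hostToDc, "DC1");
        for (int i = 0; i < 3; i++) {
            System.out.println("coordinator: " + policy.nextCoordinator());
        }
    }
}
```

Because LOCAL_ONE and LOCAL_QUORUM are evaluated relative to the coordinator's datacenter, pinning coordinators to one DC this way is what makes "local" mean the DC you intended.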
Upvotes: 4
Reputation: 34900
According to the Cluster class documentation:
A cluster object maintains a permanent connection to one of the cluster nodes which it uses solely to maintain information on the state and current topology of the cluster
Also, because the default load balancing policy is DCAwareRoundRobinPolicy, this approach should work as expected.
Upvotes: 0