Reputation: 105
I want to verify and test the 'replication_factor' setting and consistency level ONE in Cassandra.
I set up a cluster, 'MyCluster01', with three nodes in two data centers: DC1 (node1, node2) in RAC1 and DC2 (node3) in RAC2.
The structure is shown below:
[root@localhost ~]# nodetool status
Datacenter: DC1
===============
Status=Up/Down |/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.0.0.62 409.11 KB 256 ? 59bf9a73-45cc-4f9b-a14a-a27de7b19246 RAC1
UN 10.0.0.61 408.93 KB 256 ? b0cdac31-ca73-452a-9cee-4ed9d9a20622 RAC1
Datacenter: DC2
===============
Status=Up/Down |/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.0.0.63 336.34 KB 256 ? 70537e0a-edff-4f48-b5db-44f623ec6066 RAC2
Then I created a keyspace and a table as follows:
CREATE KEYSPACE my_check1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
create table replica_test(id uuid PRIMARY KEY);
After I inserted one record into that table:
insert into replica_test(id) values (uuid());
select * from replica_test;
id
--------------------------------------
5e6050f1-8075-4bc9-a072-5ef24d5391e5
I got that record.
But when I stopped node1 and ran the query again on either node2 or node3, the query failed on both:
select * from replica_test;
Traceback (most recent call last):
  File "/usr/bin/cqlsh", line 997, in perform_simple_statement
    rows = self.session.execute(statement, trace=self.tracing_enabled)
  File "/usr/share/cassandra/lib/cassandra-driver-internal-only-2.1.3.post.zip/cassandra-driver-2.1.3.post/cassandra/cluster.py", line 1337, in execute
    result = future.result(timeout)
  File "/usr/share/cassandra/lib/cassandra-driver-internal-only-2.1.3.post.zip/cassandra-driver-2.1.3.post/cassandra/cluster.py", line 2861, in result
    raise self._final_exception
Unavailable: code=1000 [Unavailable exception] message="Cannot achieve consistency level ONE" info={'required_replicas': 1, 'alive_replicas': 0, 'consistency': 'ONE'}
While the 'nodetool status' command returned:
UN 10.0.0.62 409.11 KB 256 ? 59bf9a73-45cc-4f9b-a14a-a27de7b19246 RAC1
DN 10.0.0.61 408.93 KB 256 ? b0cdac31-ca73-452a-9cee-4ed9d9a20622 RAC1
UN 10.0.0.63 336.34 KB 256 ? 70537e0a-edff-4f48-b5db-44f623ec6066 RAC2
The same error occurred when I stopped node 2 (keeping nodes 1 and 3 alive) or stopped node 3 (keeping nodes 1 and 2 alive).
So what is the problem, given that I think I have already satisfied the consistency level? And where exactly does this record exist?
Upvotes: 4
Views: 1960
Reputation: 57748
What ultimately does 'replication_factor' control?
To directly answer the question, replication factor (RF) controls the number of replicas of each data partition that exist in a cluster or data center (DC). In your case, you have 3 nodes and an RF of 1. That means that when a row is written to your cluster, it is stored on only 1 node. It also means that your cluster cannot lose even a single node without some of its data becoming unavailable.
In contrast, consider an RF of 3 on a 3-node cluster. Such a cluster could withstand the failure of 1 or 2 nodes and still serve queries for all of its data at consistency level ONE.
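To make the difference between RF 1 and RF 3 concrete, here is a minimal toy model in Python. It is only a sketch: real Cassandra places replicas by token ranges and the configured strategy, whereas this model simply walks a three-element list as if it were a ring. The node names, the three sample partitions, and the helper functions are all hypothetical. The last helper illustrates one plausible reason the error appears no matter which node is stopped: a SELECT with no partition key has to cover every token range, so with RF 1 any single down node leaves some range without a live replica.

```python
# Toy model only: replica placement here is a simple ring walk,
# not Cassandra's real token-based placement.
NODES = ["node1", "node2", "node3"]


def replicas(partition_index, rf, nodes=NODES):
    """Return the rf nodes that hold a partition in this toy ring."""
    start = partition_index % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def readable_at_one(partition_index, rf, down):
    """Consistency level ONE needs at least one live replica."""
    return any(node not in down for node in replicas(partition_index, rf))


def full_scan_ok(rf, down, partitions=range(3)):
    """A scan without a partition key must be able to read every partition."""
    return all(readable_at_one(p, rf, down) for p in partitions)


print(replicas(0, 1))                               # ['node1']
print(readable_at_one(0, 1, {"node1"}))             # False
print(readable_at_one(0, 3, {"node1", "node2"}))    # True
print(full_scan_ok(1, {"node2"}))                   # False
print(full_scan_ok(3, {"node1", "node2"}))          # True
```

With RF 1, stopping any one node makes the full scan fail in this model, which matches the 'alive_replicas': 0 reported in the error above. With RF 3, one live node is enough.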
With all of your nodes up and running, try this command:
nodetool getendpoints my_check1 replica_test 5e6050f1-8075-4bc9-a072-5ef24d5391e5
That will tell you which node holds the data for key 5e6050f1-8075-4bc9-a072-5ef24d5391e5. My first thought is that you are dropping the only node that has this key and then trying to query it.
My second thought echoes what Carlo said in his answer. You are using 2 DCs, which is really not supported with SimpleStrategy. Using SimpleStrategy with multiple DCs could produce unpredictable results. With multiple DCs, you need to use NetworkTopologyStrategy and something other than the default SimpleSnitch. Otherwise Cassandra may fail to find the proper node to complete an operation.
First of all, re-create your keyspace and table with NetworkTopologyStrategy. Then change your snitch (in cassandra.yaml) to a network-aware snitch, restart your nodes, and try this exercise again.
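As a rough illustration of what the re-created keyspace could look like, the small Python helper below renders a CREATE KEYSPACE statement for NetworkTopologyStrategy. The helper name and the per-DC replica counts (2 in DC1, 1 in DC2) are assumptions chosen for this example, not values from the question. The data center names must match what your snitch reports, which is DC1 and DC2 in the nodetool status output above. GossipingPropertyFileSnitch is one example of a network-aware snitch.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Render a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replica count.
    This is a hypothetical helper for illustration only.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += ["'%s': %d" % (dc, rf) for dc, rf in replicas_per_dc.items()]
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (
        keyspace,
        ", ".join(parts),
    )


# Assumed replica counts: 2 copies in DC1 (two nodes), 1 copy in DC2 (one node).
statement = create_keyspace_cql("my_check1", {"DC1": 2, "DC2": 1})
print(statement)
```

The printed statement can be pasted into cqlsh after the old keyspace has been dropped. With 2 replicas in DC1, a keyed read at consistency level ONE should still succeed when either node1 or node2 is down.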
Upvotes: 4
Reputation: 20021
NetworkTopologyStrategy should be used when replicating across multiple DCs.
Upvotes: 1