Reputation: 33
We are using a 3-node cluster with REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1}.
But when we insert data, the same row appears on all three nodes (I see it when I run the query on each node individually).
When I run nodetool status, I see the following:
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.46.89   6.43 MiB   256     32.8%             2db6dc5c-9d05-4dc7-9bf5-ea9e3c406267  rack1
UN  172.31.47.150  13.17 MiB  256     32.1%             eb10cc48-6117-427c-9151-48cb6761a5e6  rack1
DN  172.31.45.131  12.73 MiB  256     35.1%             cc33fc04-a02f-41e2-a00b-3835a0d98cb5  rack1
Can anyone help me understand why the data is present on all nodes?
Upvotes: 0
Views: 133
Reputation: 458
Data is not stored on all nodes when RF=1. Instead, whichever node you connect to acts as the coordinator: it fetches the data from the node responsible for it and returns the response to you.
The coordinator only stores data locally (on a write) if it ends up being one of the nodes responsible for the data's token range.
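To make the coordinator behaviour concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: `ToyCluster`, `token_for`, and the 32-bit ring are made up for illustration, each node gets a single evenly spaced token (your cluster uses 256 vnodes per node), and MD5 stands in for Cassandra's real partitioner. It shows why a query against any node returns the row even though only one node holds it.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2**32  # toy token space, NOT Cassandra's real token range


def token_for(key: str) -> int:
    """Hash a partition key onto the toy ring (MD5 is an illustrative stand-in)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class ToyCluster:
    """Hypothetical RF=1 cluster: every key has exactly one owning node."""

    def __init__(self, node_names):
        # Assumption: one evenly spaced token per node instead of vnodes.
        step = RING_SIZE // len(node_names)
        self.ring = sorted((i * step, name) for i, name in enumerate(node_names))
        self.tokens = [token for token, _ in self.ring]
        self.storage = {name: {} for name in node_names}

    def owner_of(self, key):
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the first node at the end of the ring.
        idx = bisect_left(self.tokens, token_for(key)) % len(self.ring)
        return self.ring[idx][1]

    def write(self, coordinator, key, value):
        # The coordinator forwards the write; it stores the row locally
        # only if it happens to be the owner.
        assert coordinator in self.storage, "unknown coordinator"
        self.storage[self.owner_of(key)][key] = value

    def read(self, coordinator, key):
        # Any node can coordinate a read; the data comes from the owner.
        assert coordinator in self.storage, "unknown coordinator"
        return self.storage[self.owner_of(key)].get(key)

    def holders(self, key):
        """Nodes that physically store the key."""
        return [name for name, rows in self.storage.items() if key in rows]


if __name__ == "__main__":
    nodes = ["172.31.46.89", "172.31.47.150", "172.31.45.131"]
    cluster = ToyCluster(nodes)
    cluster.write(nodes[0], "user:42", {"name": "alice"})
    for node in nodes:
        print(node, "returns", cluster.read(node, "user:42"))
    print("physically stored on:", cluster.holders("user:42"))
```

In this model every node returns the row when queried, while `holders` lists a single node, which matches what you observe when running the same SELECT on each node.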
Upvotes: 0
Reputation: 16430
Cassandra is masterless: when you query any node in the cluster, it asks the appropriate replica to answer your query. The data will not be stored on all nodes with RF=1. If you really want to verify this, look in your data/keyspace/table
directory and run sstabledump on the Data file.
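If it helps, the check can be scripted. The sketch below only assembles the commands. `verification_commands` and `run_all` are hypothetical helper names, and the keyspace, table, key, and table directory are placeholders you would need to replace with your own. It assumes `nodetool` and `sstabledump` are on the PATH of the node you run it on, and that the table directory contains files ending in `-Data.db`.

```python
import subprocess
from pathlib import Path


def verification_commands(keyspace, table, key, table_dir):
    """Build (but do not run) the commands for checking where a row lives.

    table_dir is the on-disk directory of the table on this node; the
    path is an assumption you must supply yourself.
    """
    commands = [
        # Ask the cluster which node(s) own this partition key.
        ["nodetool", "getendpoints", keyspace, table, key],
        # Flush memtables so recent writes are present in SSTables on disk.
        ["nodetool", "flush", keyspace, table],
    ]
    # Dump every Data file found for the table on this node.
    for data_file in sorted(Path(table_dir).glob("*-Data.db")):
        commands.append(["sstabledump", str(data_file)])
    return commands


def run_all(commands):
    """Run each command in order, echoing it first."""
    for cmd in commands:
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=False)
```

Running this on each node in turn, with RF=1 the row should show up in the sstabledump output of only the node that getendpoints reports. Note that the glob is evaluated when the command list is built, so Data files created by the flush will only be picked up on a second run.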
Upvotes: 1