Reputation: 7
My project requirement is as follows. We use a multi-data center (DC) Cassandra cluster. When writing to the cluster, I want only the LOCAL DC to perform the write on its nodes, because we already route each write request to the desired DC based on where the write originates. No other DC should take part in processing the write. Later, however, I want the written data to be replicated to the other DCs through normal replication between nodes. Is this cross-DC replication possible if I restrict the write to a single DC in the first place? I thought of using the DCAwareRoundRobin policy for this. If I do not open connections to REMOTE hosts in other DCs during the write, is data replication between DCs still possible afterwards? The reason I need replicas of the data in every DC is that reads should be served by whichever DC the read request lands on, not necessarily the LOCAL one.
Upvotes: 1
Views: 1192
Reputation: 2253
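Yes, it's definitely possible. Consistency levels with the "LOCAL_" prefix make a write wait only for acknowledgements from replicas in the local data center, while the data is replicated to the other data centers asynchronously. Replication is handled by the Cassandra nodes themselves, not by the client, so the driver does not need connections to the remote data centers. In this case only eventual consistency is guaranteed across data centers: because replication is asynchronous, the written data does not appear in the other data centers immediately. If you need strong consistency between data centers, you should use a consistency level with the "EACH_" prefix (such as EACH_QUORUM), but that significantly increases write latency.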
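To make the difference between the "LOCAL_" and "EACH_" levels concrete, here is a minimal sketch in plain Python. It is only a model, not driver code: `required_acks` and the data center names `dc1`/`dc2` are hypothetical names chosen for illustration, and the quorum is taken as `replicas // 2 + 1`.

```python
def quorum(replicas: int) -> int:
    """Quorum size for a given number of replicas (a strict majority)."""
    return replicas // 2 + 1


def required_acks(level: str, replicas_per_dc: dict, local_dc: str) -> dict:
    """Hypothetical helper: how many replica acknowledgements a write
    waits for in each data center before it is reported as successful.

    Data centers missing from the result still receive the write;
    it just reaches them asynchronously.
    """
    if level == "LOCAL_ONE":
        return {local_dc: 1}
    if level == "LOCAL_QUORUM":
        return {local_dc: quorum(replicas_per_dc[local_dc])}
    if level == "EACH_QUORUM":
        return {dc: quorum(n) for dc, n in replicas_per_dc.items()}
    raise ValueError(f"level not covered by this sketch: {level}")


# Assumed layout: two data centers with 3 replicas each.
replicas = {"dc1": 3, "dc2": 3}
print(required_acks("LOCAL_QUORUM", replicas, "dc1"))  # {'dc1': 2}
print(required_acks("EACH_QUORUM", replicas, "dc1"))   # {'dc1': 2, 'dc2': 2}
```

With a "LOCAL_" level the write only blocks on the local data center, which is why latency stays low; with EACH_QUORUM it blocks on a quorum in every data center.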
If you need strong consistency within a single data center, the following rule must be satisfied: (nodes_written + nodes_read) > number_of_replicas. For example, if you have 3 replicas in each data center, both read and write requests should use the LOCAL_QUORUM consistency level (2 + 2 > 3). If eventual consistency is sufficient, LOCAL_ONE can be used.
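The rule above can be checked with a few lines of Python. This is just the arithmetic from the answer, not anything from a Cassandra driver; the function names are made up for the example.

```python
def quorum(replicas: int) -> int:
    """Quorum size for a given number of replicas (a strict majority)."""
    return replicas // 2 + 1


def is_strongly_consistent(nodes_written: int, nodes_read: int,
                           number_of_replicas: int) -> bool:
    """True when every read set must overlap every write set."""
    return nodes_written + nodes_read > number_of_replicas


replicas_in_dc = 3
q = quorum(replicas_in_dc)  # 2 for 3 replicas

# LOCAL_QUORUM for both writes and reads: 2 + 2 > 3
print(is_strongly_consistent(q, q, replicas_in_dc))  # True

# LOCAL_ONE for both writes and reads: 1 + 1 > 3 does not hold
print(is_strongly_consistent(1, 1, replicas_in_dc))  # False
```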
There is a similar case in the "Geographical Location Scenario" section.
Upvotes: 3