Cassandra sync with recovered node

I am trying to build a Cassandra backup and recovery process.

Let's say I have 2 nodes, A and B, and a table C with replication factor 2. Table C has a row with ID=5 and Name="Alex". Now something bad happens to node B, and we need to take it down for a few minutes to do a restore. Meanwhile, while node B is down, someone changes the row with ID=5 from Name="Alex" to Name="Alehandro".

Node B comes back up with the restored data, so on this node the row with ID=5 still contains Name="Alex".

What will happen when I try to read the row with ID=5? Will node A synchronize with node B?

Thanks.

Upvotes: 0

Views: 216

Answers (1)

Alex Ott

Reputation: 87069

Cassandra has several ways to synchronize data to nodes that missed writes because they were down, had a garbage collection pause, etc. These include:

  • Hints - for a limited time (3 hours by default, configurable), the coordinator node stores the write operations that a down node has missed, and replays them against that node when it comes back up
  • Repair - explicit synchronization of data, triggered manually via nodetool repair or automated with tools such as Reaper
  • Read repair - if you read with a consistency level that requires responses from several nodes (TWO, LOCAL_QUORUM, QUORUM, etc.), the coordinator node detects discrepancies, returns the data with the newest timestamp and, if necessary, fixes the data on the node that holds the old version
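To make the read repair idea concrete, here is a toy Python model of the scenario from the question. It is only a sketch of "newest timestamp wins", not Cassandra's actual implementation: the `Cell` class, the `read_with_repair` function and the timestamp values are all hypothetical names and numbers made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    write_ts: int  # write timestamp (hypothetical values, larger = newer)


def read_with_repair(replicas, key, nodes_to_read):
    """Toy model: read `key` from the given replicas, return the newest
    value, and overwrite any contacted replica that holds an older one."""
    responses = {name: replicas[name][key] for name in nodes_to_read}
    newest = max(responses.values(), key=lambda cell: cell.write_ts)
    for name, cell in responses.items():
        if cell.write_ts < newest.write_ts:
            # "Repair" the stale replica with the newest version.
            replicas[name][key] = Cell(newest.value, newest.write_ts)
    return newest.value


# Node A saw the update; node B was restored from an older backup.
replicas = {
    "A": {5: Cell("Alehandro", 2000)},
    "B": {5: Cell("Alex", 1000)},
}

# Reading only from B (like consistency level ONE hitting B): stale value.
print(read_with_repair(replicas, 5, ["B"]))
# Reading from both nodes: newest value is returned and B gets fixed.
print(read_with_repair(replicas, 5, ["A", "B"]))
print(replicas["B"][5].value)
```

In this model the first read returns the old name because only the stale replica is consulted, while the second read returns the new name and leaves node B updated.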

To answer your last question: when the 2nd node is back, you can get old data if the hints haven't been replayed yet, you're reading directly from that node, and you're reading with consistency level ONE or LOCAL_ONE.
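The arithmetic behind this can be sketched in a few lines of Python. The helper names `replicas_required` and `stale_read_possible` are hypothetical, and the sketch assumes a single datacenter (so LOCAL_ONE and LOCAL_QUORUM behave like ONE and QUORUM); the quorum formula is floor(RF / 2) + 1.

```python
def replicas_required(consistency_level, replication_factor):
    """Number of replicas that must answer a read (single-DC assumption)."""
    levels = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "TWO": 2,
        "QUORUM": replication_factor // 2 + 1,
        "LOCAL_QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def stale_read_possible(consistency_level, replication_factor, stale_replicas):
    """A stale result is possible only if the read can be satisfied
    entirely by replicas that still hold the old data."""
    return replicas_required(consistency_level, replication_factor) <= stale_replicas


# Scenario from the question: RF=2, node B is the only stale replica.
for cl in ("ONE", "LOCAL_ONE", "TWO", "QUORUM", "ALL"):
    print(cl, stale_read_possible(cl, replication_factor=2, stale_replicas=1))
```

With replication factor 2, QUORUM needs both replicas, so any read at TWO, QUORUM or ALL includes node A and its newer timestamp; only ONE and LOCAL_ONE can be answered by node B alone.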

P.S. I recommend looking through the DSE Architecture Guide - it covers how Cassandra works.

Upvotes: 1
