Reputation: 1098
Suppose I have a Cassandra setup with [3 nodes - 1 datacenter - 1 cluster].
It has a keyspace with replication factor = 2.
I am taking regular snapshots and incremental backups on all nodes.
One of my 3 nodes goes down completely, for whatever reason, and I want to restore it from backup.
The Cassandra (DataStax) documentation suggests that you usually TRUNCATE the table before restoring.
Question: As I am only going to restore the backup on one node, is TRUNCATE necessary? As I understand it, TRUNCATE deletes that table's data from ALL nodes. TRUNCATE Doc
So if I truncate the table and restore the backup on only one node, wouldn't I also lose the data for that table that was stored on the other nodes?
Upvotes: 0
Views: 563
Reputation: 800
First of all, in your scenario, you might not want to restore a backup at all. Since you have replication factor = 2, every row the failed node held still has a replica on one of the two surviving nodes. Therefore, you could remove the dead node from the cluster and add a fresh node in its place. When the new node bootstraps, Cassandra automatically streams it the data for its token ranges from the remaining replicas.
Alternatively, or in addition, you can stream the SSTable files from the backup into your cluster with sstableloader.
A few other points, though, for the sake of completeness:
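To see why one lost node does not lose data at replication factor = 2, here is a small Python sketch. It is a toy model, not Cassandra code: I am assuming SimpleStrategy-style placement where each token range lives on its primary node plus the next node clockwise on the ring, and the node names and helper functions are made up for illustration.

```python
# Toy model, not Cassandra code: three nodes on a ring, replication factor 2.
# Node names and function names are hypothetical.
NODES = ["node1", "node2", "node3"]
RF = 2


def replicas(primary_index, nodes=NODES, rf=RF):
    """Nodes holding a token range: its primary owner plus the next rf - 1 nodes clockwise."""
    return [nodes[(primary_index + i) % len(nodes)] for i in range(rf)]


def live_replicas_after_failure(down_node, nodes=NODES, rf=RF):
    """Map each primary range owner to the replicas still up once down_node is lost."""
    return {
        nodes[i]: [n for n in replicas(i, nodes, rf) if n != down_node]
        for i in range(len(nodes))
    }


if __name__ == "__main__":
    for down in NODES:
        print(down, "down ->", live_replicas_after_failure(down))
```

Whichever node you take out, every range keeps at least one live replica, and that replica is what the replacement node streams from.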
Why Truncate?
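Truncate is recommended in certain scenarios because the data you restore will have older timestamps than the data written after the backup was taken. Cassandra resolves conflicts by write timestamp, so newer writes and deletions take precedence over the restored data.
The example in the link you sent is rather apt to explain one of those scenarios. If you accidentally delete a lot of data and wish to restore your old data, you need to truncate first to remove the tombstones that mark those rows as deleted. Otherwise the tombstones, being newer, would keep hiding the restored rows.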
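The tombstone problem can be sketched in a few lines of Python. This is a simplified model of timestamp-based (last-write-wins) reconciliation, not Cassandra's actual implementation: the `Cell` type, the `reconcile` function, and the timestamps are all hypothetical, and I am assuming a tombstone wins when timestamps tie.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One version of a value. value=None stands for a tombstone (a deletion marker)."""
    value: Optional[str]
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning version: the higher timestamp wins, a tombstone wins a tie."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value is None else b


# The backup holds the row as it was written at time 100.
restored = Cell(value="alice", timestamp=100)
# The accidental delete happened later, at time 200.
tombstone = Cell(value=None, timestamp=200)

# Restore without TRUNCATE: the newer tombstone still shadows the restored row.
without_truncate = reconcile(restored, tombstone)

# Restore after TRUNCATE: the tombstone is gone, so the restored row is all there is.
after_truncate = restored

if __name__ == "__main__":
    print("without truncate:", without_truncate)
    print("after truncate:  ", after_truncate)
```

In the first case a read would still see the row as deleted, which is exactly the situation truncating first is meant to avoid.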
Upvotes: 0