Reputation: 446
After taking a snapshot, the size of one of the SSTable folders is about 1 TB:
$ du -sh *
1001 GB user-820d7e50c85111eab874f3e361ecc166
Surprisingly, the Cassandra snapshot folder inside that SSTable folder (snp-2021-04-11-0400-01) was 785 GB, and once I deleted the snapshot folder, the size of the SSTable folder dropped to 281 GB:
-bash-4.2$ du -sh *
281G user-820d7e50c85111eab874f3e361ecc166
My question is: why is the snapshot folder more than twice the size of the data folder? Is this normal in Cassandra?
My assumption was that Cassandra copies the SSTables into the snapshot folder, so the snapshot would be the same size as the data.
Upvotes: 2
Views: 496
Reputation: 87069
Cassandra doesn't copy the SSTables; it creates a hard link (just another name for the same file) to each original SSTable in the snapshots folder. When compaction happens, the original SSTable is deleted from the data folder, but its data stays on disk because the snapshot still holds another name for it. So if you take snapshots often, and compaction also happens often, you end up with a lot of links to old SSTables that are no longer part of the live data.
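To make the hard-link behaviour concrete, here is a minimal sketch in Python that imitates it on a temporary directory. It is not Cassandra code: the file name `demo-Data.db`, the tag `demo-tag`, and the function name are made up for illustration, and the "compaction" step is just a plain file delete.

```python
import os
import tempfile


def demo_hard_link_snapshot():
    """Return (links after snapshot, links after 'compaction', bytes still on disk)."""
    with tempfile.TemporaryDirectory() as table_dir:
        # Hypothetical names, only loosely modelled on a table directory.
        sstable = os.path.join(table_dir, "demo-Data.db")
        snapshot_dir = os.path.join(table_dir, "snapshots", "demo-tag")
        os.makedirs(snapshot_dir)

        with open(sstable, "wb") as f:
            f.write(b"x" * 4096)

        # "Snapshot": give the same file a second name; no data is copied.
        snapshot_file = os.path.join(snapshot_dir, "demo-Data.db")
        os.link(sstable, snapshot_file)
        links_after_snapshot = os.stat(sstable).st_nlink

        # "Compaction": the live name goes away, the snapshot name remains.
        os.remove(sstable)
        links_after_compaction = os.stat(snapshot_file).st_nlink
        bytes_still_on_disk = os.path.getsize(snapshot_file)

        return links_after_snapshot, links_after_compaction, bytes_still_on_disk


if __name__ == "__main__":
    print(demo_hard_link_snapshot())
```

Right after the snapshot the file has two names and takes no extra space. After the delete, the snapshot holds the only remaining name, so the space is now charged entirely to the snapshot folder.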
The solution is to clean up snapshots periodically: you can use the nodetool clearsnapshot command to delete selected snapshots (old backups, for example).
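Before clearing anything, it can help to estimate how much space the snapshots alone are holding. The sketch below is one possible way to do that on a POSIX filesystem, assuming snapshots live in a `snapshots` subfolder of the table folder; the function name `snapshot_only_bytes` is made up for this example. It sums the files under `snapshots` whose inodes no longer appear among the live files in the table folder.

```python
import os


def snapshot_only_bytes(table_dir):
    """Bytes that stay on disk only because a snapshot still links to them."""
    # Inodes of the live files sitting directly in the table folder.
    live_inodes = set()
    for entry in os.scandir(table_dir):
        if entry.is_file(follow_symlinks=False):
            st = os.lstat(entry.path)
            live_inodes.add((st.st_dev, st.st_ino))

    seen = set()  # count each inode once, even if several snapshots link to it
    total = 0
    snapshots_root = os.path.join(table_dir, "snapshots")
    for root, _dirs, files in os.walk(snapshots_root):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key in live_inodes or key in seen:
                continue
            seen.add(key)
            total += st.st_size  # apparent file size, not allocated blocks
    return total
```

The result approximates the space that removing all snapshots of that table would free. It uses apparent file sizes, so it can differ somewhat from what du reports.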
Upvotes: 1