Sam

Reputation: 446

Cassandra snapshot folder size is too high

After taking a snapshot, the data folder of one of my tables is about 1 TB:

$ du -sh * 
1001 GB    user-820d7e50c85111eab874f3e361ecc166

Surprisingly, the Cassandra snapshot folder inside that table folder (snp-2021-04-11-0400-01) was 785G, and once I deleted the snapshot folder, the table folder dropped to 281 GB:

-bash-4.2$ du -sh *
281G    user-820d7e50c85111eab874f3e361ecc166

My question is: why is the snapshot folder more than twice the size of the data folder? Is this normal in Cassandra?

My assumption was that Cassandra copies the SSTables into the snapshot folder, so the snapshot would be the same size as the data.

Upvotes: 2

Views: 496

Answers (1)

Alex Ott

Reputation: 87069

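Cassandra doesn't copy the SSTables. It creates a hard link (just another name for the same file) from each original SSTable into the snapshot folder. When compaction happens, the original SSTable is deleted from the data folder, but its contents stay on disk because the snapshot still holds another name for them. That is why the snapshot folder can end up larger than the live data: it references old SSTables that compaction has already replaced. And if you take snapshots often, and compaction happens often too, you'll have a lot of links to old SSTables.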
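To make the hard-link behaviour concrete, here is a minimal sketch in Python that imitates it on a temporary directory. It does not use Cassandra at all: the file names (`old-Data.db`, `new-Data.db`), the `snapshots/snap1` layout, and the `dir_size` helper are made up for illustration, and sizes are summed from file lengths rather than disk blocks, so the numbers are only analogous to what `du` reports.

```python
import os
import tempfile


def dir_size(path):
    """Sum file sizes under path, counting each inode once (as du does)."""
    seen = set()
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.stat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:  # a hard-linked file is counted only once
                seen.add(key)
                total += st.st_size
    return total


with tempfile.TemporaryDirectory() as table_dir:
    # Hypothetical layout: <table_dir>/snapshots/snap1
    snap_dir = os.path.join(table_dir, "snapshots", "snap1")
    os.makedirs(snap_dir)

    # A stand-in for a live SSTable (1000 bytes).
    old = os.path.join(table_dir, "old-Data.db")
    with open(old, "wb") as f:
        f.write(b"x" * 1000)

    # "Snapshot": a hard link, not a copy, so no extra space is used yet.
    os.link(old, os.path.join(snap_dir, "old-Data.db"))
    size_after_snapshot = dir_size(table_dir)

    # "Compaction": a new, smaller file replaces the old one in the data
    # folder, but the snapshot link keeps the old contents on disk.
    new = os.path.join(table_dir, "new-Data.db")
    with open(new, "wb") as f:
        f.write(b"y" * 600)
    os.remove(old)

    size_after_compaction = dir_size(table_dir)
    snapshot_size = dir_size(snap_dir)
    live_size = size_after_compaction - snapshot_size

print("after snapshot:  ", size_after_snapshot)
print("after compaction:", size_after_compaction)
print("snapshot only:   ", snapshot_size)
print("live data only:  ", live_size)
```

Right after the snapshot the folder is no bigger than the live data. After the simulated compaction the snapshot alone accounts for more space than the live data, which is the same pattern as the 785G snapshot next to 281 GB of data in the question.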

The solution is to clean up snapshots periodically. You can use the nodetool clearsnapshot command to delete selected snapshots (old backups, for example).

Upvotes: 1
