Pankaj Goyal

Reputation: 1548

Disk Space not freed up even after deleting keyspace from cassandra db and compaction

I created a keyspace and a table (column family) in it, say "ks.cf".

After inserting a few hundred thousand rows into the column family cf, I checked the disk usage with df -h.

Then I dropped the keyspace with the command DROP KEYSPACE ks from cqlsh.

Even after the drop, the disk usage remains the same. I also ran nodetool compact, but no luck.

Can anyone help me configure things so that disk space is freed after deleting the data/rows?

Upvotes: 16

Views: 13130

Answers (5)

mirekphd

Reputation: 6791

The nodetool command can be used to clean up the snapshots of all unused (i.e. previously dropped) tables in one go (here issued inside a running bitnami/cassandra:4.0 Docker container). Note that, as the output confirms, --all clears every snapshot in every keyspace:

$ nodetool --username <redacted> --password <redacted> clearsnapshot --all
Requested clearing snapshot(s) for [all keyspaces] with [all snapshots]

Evidence: space used by old table snapshots in the dicts keyspace:

a) before the cleanup:

$ sudo du -sch /home/<host_user>/cassandra_data/cassandra/data/data/<keyspace_name>/
134G    /home/<redacted>/cassandra_data/cassandra/data/data/dicts/
134G    total

b) after the cleanup:

$ sudo du -sch /home/<host_user>/cassandra_data/cassandra/data/data/<keyspace_name>/
4.0K    /home/<redacted>/cassandra_data/cassandra/data/data/dicts/
4.0K    total

Note: the accepted answer missed the --all switch (and the need to log in), but it still deserves to be upvoted.

Upvotes: 1

LetsNoSQL

Reputation: 1538

Cassandra does not clear snapshots automatically when you drop a table or keyspace. If auto_snapshot is enabled in cassandra.yaml, then every time you drop a table or keyspace Cassandra captures a snapshot of that table. The snapshot lets you restore the table data if the drop was done by mistake. If you do want to clear that table data from disk, run the clearsnapshot command below to free the space.

nodetool -u XXXX -pw XXXXX clearsnapshot -t snapshotname
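To find the snapshot name to pass to -t, nodetool listsnapshots prints the snapshots present on the node. The helper below is only a sketch of how the clearsnapshot call could be wrapped: the function name clear_snapshot and the DRY_RUN variable are made up for illustration, the snapshot tag and keyspace are placeholders, and the -u/-pw options are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): build a clearsnapshot command
# for one snapshot tag, optionally restricted to a single keyspace.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
clear_snapshot() {
    snapshot_name="$1"      # tag as reported by `nodetool listsnapshots`
    keyspace="${2:-}"       # optional keyspace name
    set -- nodetool clearsnapshot -t "$snapshot_name"
    if [ -n "$keyspace" ]; then
        set -- "$@" -- "$keyspace"
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Example (prints the command only):
#   clear_snapshot my-snapshot-tag ks
# To actually run it:
#   DRY_RUN=0 clear_snapshot my-snapshot-tag ks
```

Printing the command first makes it easier to double-check the tag before anything is deleted, since a cleared snapshot cannot be recovered.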

You can disable this auto_snapshot feature any time in cassandra.yaml.

Upvotes: 2

törzsmókus

Reputation: 2002

nodetool cleanup removes all data from disk that is no longer needed there, i.e. data that the node is not responsible for. (clearsnapshot clears all snapshots, which may not be what you want.)

Upvotes: 1

Highstead

Reputation: 2441

Ran into this problem recently. When you drop a table, a snapshot is made. This snapshot allows you to roll the drop back if it was not intended. If you do, however, want that hard drive space back, you need to run:

nodetool -h localhost -p 7199 clearsnapshot

on the appropriate nodes. Additionally, you can turn these automatic snapshots off with auto_snapshot: false in your cassandra.yaml.
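Before changing the setting, it can help to see what a node currently has configured. A minimal sketch, assuming a plain top-level auto_snapshot key in the file; the function name auto_snapshot_setting and the example path are illustrative, and the fallback to "true" reflects the default Cassandra ships with:

```shell
#!/bin/sh
# Hypothetical helper: print the auto_snapshot value from a cassandra.yaml file.
# Commented-out or missing keys fall back to "true" (the shipped default).
auto_snapshot_setting() {
    yaml_file="$1"
    value=$(sed -n 's/^auto_snapshot:[[:space:]]*\([A-Za-z]*\).*/\1/p' "$yaml_file" | head -n 1)
    echo "${value:-true}"
}

# Example (the path is an assumption, adjust it to your install):
#   auto_snapshot_setting /etc/cassandra/cassandra.yaml
```

This only reads the file on disk; it does not tell you what a running node loaded at startup.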


Upvotes: 19

Aki

Reputation: 3839

If you are just trying to delete rows, then you need to let the deletion go through the usual delete cycle (delete row -> tombstone creation -> compaction actually deletes the row).

If you want to get rid of your keyspace completely, check your Cassandra data folder (it should be specified in your yaml file). In my case it is "/mnt/cassandra/data/". In this folder there is a subfolder for each keyspace (e.g. ks). You can simply delete the folder for your keyspace.

If you want to keep the folder around, it is good to know that Cassandra creates a snapshot of your keyspace before dropping it, basically a backup of all of your data. Go into the 'ks' folder and find the snapshots subdirectory (there is one under each table's directory), then delete the snapshot related to your keyspace drop.
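To see which snapshots are sitting on disk and how much space each takes before deleting anything by hand, something like the following could be used. It is a sketch that assumes the usual layout <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>; the function name list_snapshots is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list every snapshot directory under a Cassandra data
# directory, one per line, prefixed by its size in KiB (as reported by du -sk).
# Assumed layout: <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>
list_snapshots() {
    data_dir="$1"
    find "$data_dir" -mindepth 4 -maxdepth 4 -type d -path '*/snapshots/*' \
        -exec du -sk {} \;
}

# Example, using the data folder mentioned above:
#   list_snapshots /mnt/cassandra/data
```

Only directories exactly four levels below the data directory are reported, so live SSTable directories are not listed.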

Upvotes: 7
