Reputation: 1548
I created a keyspace and a table (column family) within it, say ks.cf.
After inserting a few hundred thousand rows into the column family cf, I checked the disk usage with df -h.
Then I dropped the keyspace with DROP KEYSPACE ks from cqlsh.
After the drop, the disk usage remained the same. I also ran nodetool compact, but no luck.
Can anyone help me configure things so that disk space is freed after deleting data/rows?
Upvotes: 16
Views: 13130
Reputation: 6791
The nodetool command can clean up the snapshots of all unused (i.e. previously dropped) tables in one go (here issued inside a running bitnami/cassandra:4.0 Docker container):
$ nodetool --username <redacted> --password <redacted> clearsnapshot --all
Requested clearing snapshot(s) for [all keyspaces] with [all snapshots]
Evidence: space used by old table snapshots in the dicts keyspace:
a) before the cleanup:
$ sudo du -sch /home/<host_user>/cassandra_data/cassandra/data/data/<keyspace_name>/
134G /home/<redacted>/cassandra_data/cassandra/data/data/dicts/
134G total
b) after the cleanup:
$ sudo du -sch /home/<host_user>/cassandra_data/cassandra/data/data/<keyspace_name>/
4.0K /home/<redacted>/cassandra_data/cassandra/data/data/dicts/
4.0K total
Note: the accepted answer omits the --all switch (and the need to log in), but it still deserves an upvote.
Upvotes: 1
Reputation: 1538
Cassandra does not clear snapshots automatically when you drop a table or keyspace. If auto_snapshot is enabled in cassandra.yaml, Cassandra takes a snapshot of the table every time you drop a table or keyspace. This snapshot lets you restore the table's data if the drop was a mistake. If you do want to remove that data from disk, run the clearsnapshot command below to free the space.
nodetool -u XXXX -pw XXXXX clearsnapshot -t snapshotname
You can disable this auto_snapshot feature any time in cassandra.yaml.
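To avoid wiping every snapshot, one option is to list the snapshots first and then clear only the one created by the drop. A minimal sketch, assuming nodetool is on the PATH and can reach the node without extra flags (add -u/-pw as in the command above if JMX authentication is on). The function name clear_named_snapshot is made up for illustration:

```shell
# Sketch: clear one named snapshot for one keyspace instead of every snapshot.
# Assumption: nodetool is on PATH and needs no extra connection flags.
clear_named_snapshot() (
    tag=${1:?usage: clear_named_snapshot <snapshot_name> <keyspace>}
    keyspace=${2:?usage: clear_named_snapshot <snapshot_name> <keyspace>}
    # Show what exists before deleting anything.
    nodetool listsnapshots
    # Remove only the snapshot with this name, and only in this keyspace.
    nodetool clearsnapshot -t "$tag" -- "$keyspace"
)
```

Usage would look like `clear_named_snapshot dropped-1234567890-cf ks`, where the snapshot name is a placeholder; take the real name from the listsnapshots output.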
Upvotes: 2
Reputation: 2002
nodetool cleanup removes all data from disk that is no longer needed there, i.e. data that the node is not responsible for (token ranges it no longer owns). (clearsnapshot will clear all snapshots, which may not be what you want.)
Upvotes: 1
Reputation: 2441
Ran into this problem recently. When a table is dropped, a snapshot is taken first. This snapshot lets you roll back the drop if it was not intended. If you do want that disk space back, however, you need to run:
nodetool -h localhost -p 7199 clearsnapshot
on the appropriate nodes. Additionally, you can turn these snapshots off with auto_snapshot: false in your cassandra.yaml.
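Before changing anything, it can help to check what the node is currently configured to do. A small sketch that prints the active auto_snapshot value from a config file; the helper name auto_snapshot_value is hypothetical, and the path to cassandra.yaml depends on your install, so it is passed as an argument:

```shell
# Sketch: print the value of the uncommented auto_snapshot setting.
# Assumption: the setting is a top-level "auto_snapshot: <value>" line.
auto_snapshot_value() (
    yaml=${1:?usage: auto_snapshot_value <path to cassandra.yaml>}
    # Commented-out lines start with "#", so they do not match the anchor.
    sed -n 's/^auto_snapshot:[[:space:]]*//p' "$yaml"
)
```

Call it as `auto_snapshot_value <path to cassandra.yaml>`. An edit to cassandra.yaml only takes effect after the node is restarted.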
Upvotes: 19
Reputation: 3839
If you are just trying to delete rows, you need to let the deletion go through the usual delete cycle (delete row -> tombstone creation -> compaction actually deletes the row).
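The last step of that cycle is gated by the table's gc_grace_seconds setting: compaction only purges a tombstone once it is older than that, so running nodetool compact right after a delete usually frees little. A small sketch to look the value up, assuming Cassandra 3.0 or later (where system_schema.tables exists) and a cqlsh that can connect without extra options; show_gc_grace is a hypothetical helper name:

```shell
# Sketch: show how long tombstones are kept before compaction may purge them.
# Assumptions: Cassandra 3.0+ and cqlsh on PATH with default connection settings.
show_gc_grace() (
    keyspace=${1:?usage: show_gc_grace <keyspace> <table>}
    table=${2:?usage: show_gc_grace <keyspace> <table>}
    cqlsh -e "SELECT gc_grace_seconds FROM system_schema.tables WHERE keyspace_name = '$keyspace' AND table_name = '$table';"
)
```

For the table in the question this would be `show_gc_grace ks cf`. The default is 864000 seconds (10 days).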
If you want to get rid of your keyspace completely, check your Cassandra data folder (it should be specified in your yaml file). In my case it is "/mnt/cassandra/data/". This folder has a subfolder for each keyspace (e.g. ks). You can delete the folder for your keyspace entirely.
If you want to keep the folder around, it is good to know that Cassandra takes a snapshot of your keyspace before dropping it, basically a backup of all of your data. Go into the 'ks' folder and find the snapshots subdirectory (in recent versions it sits inside each table's directory), then delete the snapshot related to your keyspace drop.
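To see where the space is actually going before deleting anything by hand, you can list every snapshots directory under the data folder together with its size. A minimal sketch, assuming the usual layout of <data_dir>/<keyspace>/<table directory>/snapshots/; the function name snapshot_usage is made up for illustration:

```shell
# Sketch: print the size of each "snapshots" directory under a data directory.
# Assumption: layout is <data_dir>/<keyspace>/<table directory>/snapshots/<name>/
snapshot_usage() (
    data_dir=${1:?usage: snapshot_usage <data_dir>}
    # -prune stops find from descending into a snapshots directory once matched;
    # du -sh then prints one human-readable total per matched directory.
    find "$data_dir" -type d -name snapshots -prune -exec du -sh {} +
)
```

With the path from this answer that would be `snapshot_usage /mnt/cassandra/data/` (run it with sudo or as the cassandra user if the files are not readable otherwise).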
Upvotes: 7