nmakb

Reputation: 1235

What is the suggested procedure for dropping a keyspace with 55 TB of data on a 66-node cluster?

There is a keyspace occupying 55 TB of space, and the cluster has 66 nodes (DSE 5.1.6, Cassandra 3.11). The keyspace has 3 tables, and there have been no reads/writes on the tables for the last month.

I want to drop the keyspace/tables to reclaim space without causing any issues in the cluster.

  1. When dropping the tables and keyspace on this cluster, what might cause issues: the size of the unused keyspace (55 TB), or the number of nodes (66) in the cluster to which the schema change (dropping the tables/keyspace) would need to be propagated?
  2. Other than dropping the tables and then the keyspace, is there any other way to safely drop the keyspace? For example, would deleting the SSTables from the nodes make the drops quicker and smoother? Would deleting SSTables trigger repairs/compactions and cause any issues?
  3. Is there any way to disable auto_snapshot at the session level, or from the driver level, for specific tables or a keyspace?
  4. Any considerations before/after dropping the tables/keyspace? Here are the steps I am going to follow:
     a. Run nodetool describecluster
     b. Drop the tables using cqlsh (with request-timeout=600)
     c. Drop the keyspace using cqlsh (with request-timeout=600)
     d. Run nodetool describecluster again and check for any inconsistencies
     e. On each node, delete the data directory for the keyspace (the data is already backed up elsewhere, so there is no need for an auto-snapshot)
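For reference, the planned steps could be scripted roughly as follows. This is a sketch only: "my_keyspace", the table names, and the data directory path are placeholders, and it assumes cqlsh and nodetool are on the PATH of a node in the cluster.

```shell
# a. Confirm all nodes agree on one schema version before starting
nodetool describecluster

# b. Drop each table with a generous client-side timeout
cqlsh --request-timeout=600 -e "DROP TABLE my_keyspace.table1;"
cqlsh --request-timeout=600 -e "DROP TABLE my_keyspace.table2;"
cqlsh --request-timeout=600 -e "DROP TABLE my_keyspace.table3;"

# c. Drop the now-empty keyspace
cqlsh --request-timeout=600 -e "DROP KEYSPACE my_keyspace;"

# d. Check again for schema disagreement across the 66 nodes
nodetool describecluster

# e. On EACH node, remove the leftover data directory
#    (path is a placeholder; the data is already backed up elsewhere)
rm -rf /var/lib/cassandra/data/my_keyspace
```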

Upvotes: 1

Views: 541

Answers (1)

Erick Ramirez

Reputation: 16323

The only real issue I can foresee you running into is a schema disagreement. Given (a) the size of the data and (b) the number of nodes, my approach would be:

  1. Attempt to TRUNCATE one table at a time
  2. If the TRUNCATE times out, attempt it a second time
  3. Once a table is truncated, DROP it
  4. Wait at least 1 minute for the schema to propagate
  5. Check for schema disagreement and fix as appropriate
  6. Repeat the steps above until all tables are dropped
  7. DROP the keyspace
  8. Manually delete the snapshots from the filesystem as necessary
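In cqlsh/nodetool terms, the loop above might look like this. It is a sketch: "my_keyspace" and "my_table" are placeholders, the 600-second request timeout from the question is reused so the TRUNCATE has room to finish, and nodetool clearsnapshot is shown as one way to clean up the snapshots that auto_snapshot leaves behind.

```shell
# Repeat this TRUNCATE/DROP pair for each of the 3 tables:
cqlsh --request-timeout=600 -e "TRUNCATE my_keyspace.my_table;"   # retry once if it times out
cqlsh --request-timeout=600 -e "DROP TABLE my_keyspace.my_table;"
sleep 60                       # give the schema change time to propagate to all 66 nodes
nodetool describecluster       # all nodes should report a single schema version

# Once every table is dropped:
cqlsh -e "DROP KEYSPACE my_keyspace;"

# auto_snapshot takes a snapshot on each TRUNCATE/DROP; clear them on each node
# (or delete the snapshots/ directories from the filesystem directly):
nodetool clearsnapshot
```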

To answer your questions directly:

  1. You can run into timeouts and schema disagreement. And yes, it's a combination of (a) the data size and (b) the number of nodes.
  2. I'd recommend truncating the tables first, as above. Truncation doesn't change the schema, so it can't cause a schema disagreement, and with the data already gone the subsequent DROP should complete without issues.
  3. No, you can only disable auto_snapshot in cassandra.yaml, which requires a rolling restart. You don't want to do that, because restarting all 66 nodes isn't necessary for this.
  4. I've posted the procedure above.
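For context on point 3: auto_snapshot is a node-level setting in cassandra.yaml, with no session-, driver-, table-, or keyspace-level equivalent. The relevant fragment looks like this (a config excerpt; changing it only takes effect after restarting the node, hence the rolling restart):

```yaml
# cassandra.yaml (per node) -- no per-session or per-table override exists
auto_snapshot: true    # default; a snapshot is taken automatically before TRUNCATE/DROP
```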

Upvotes: 2
