Little Helper

Reputation: 2026

Fixing Cassandra Database

My co-worker and I have been thrown into a project that uses Cassandra with no introductions. Alright, let's do this!

SELECT * FROM reports WHERE timestamp < '2019-01-01 00:00:00' ALLOW FILTERING;

Error: 1300

Apparently, we have too many tombstones. What's that? A tombstone is a marker for deleted data that hasn't been physically removed yet, for performance reasons. A repair (nodetool repair) should be run before gc_grace_seconds expires (default: 10 days) so that tombstones can later be dropped safely during compaction. Now, this project is around 7 years old and there doesn't seem to be a job that runs repair. Judging by the default warning and failure thresholds, 1K tombstones per query is already a lot. We found about 1.4M. We measured the number of tombstones by turning Tracing on, running a SELECT query, and accumulating the tombstone counts reported in the trace.
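For reference, here's roughly how we tallied the counts, as a shell sketch. The trace_output function below fakes saved cqlsh tracing output, and the exact wording of the trace lines is an assumption from our 3.11 logs:

```shell
# Sketch: sum tombstone cell counts from saved cqlsh tracing output.
# trace_output stands in for "cqlsh -e 'TRACING ON; SELECT ...'" piped
# to a file; the line wording is an assumption from our 3.11 logs.
trace_output() {
  cat <<'EOF'
Read 10 live rows and 1500 tombstone cells
Read 0 live rows and 2500 tombstone cells
EOF
}
# Add up the number that precedes the word "tombstone" on each line.
TOTAL=$(trace_output | awk '{ for (i = 1; i < NF; i++) if ($(i+1) == "tombstone") sum += $i } END { print sum }')
echo "$TOTAL"   # 4000
```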

We tried to run nodetool repair --full -pr -j 4, but we get Validation failed in /10.0.3.1. DataStax's guide to repairing failed repairs suggests resolving the validation error with nodetool scrub, but afterwards we still get the same error. The guide then suggests running sstablescrub, which failed with an out-of-memory exception.

Going back to our original problem of deleting data from before 2019, we tried to run DELETE FROM reports WHERE timestamp < '2019-01-01 00:00:00'. However, timestamp is not our partition key, so we are not allowed to delete data like this, which has also been confirmed by many other StackOverflow posts and a DataStax issue on Jira. Every post mentions that we should "just" change the schema of our Cassandra database to fit our queries. But first, we only need to perform this deletion once; and second, our client wants this data deleted as soon as possible.
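The slow solution we have in mind is a two-step job: export the partition keys of the old rows, then issue one DELETE per partition. A shell sketch (report_id is a stand-in for our real partition key, and the CSV fakes the export):

```shell
# Step 1 (faked here): export (partition key, timestamp) pairs of old
# rows, e.g. with a paged driver query. report_id is a stand-in for
# the real partition key of the reports table.
cat > old_rows.csv <<'EOF'
report_id,timestamp
1,2018-06-01
2,2020-02-01
1,2018-07-01
EOF
# Step 2: emit one DELETE per distinct partition older than the cutoff.
STMTS=$(awk -F, 'NR > 1 && $2 < "2019-01-01" && !seen[$1]++ {
  printf "DELETE FROM reports WHERE report_id = %s;\n", $1
}' old_rows.csv)
echo "$STMTS"   # DELETE FROM reports WHERE report_id = 1;
```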

  1. Is there a way of easily changing the schema of a Cassandra database?
  2. Is there a way that we can make a slow solution that at least works?

All in all, we are new to Cassandra and unsure how to proceed.

What we want is

  1. delete all data from before 2019 and confirm that it is deleted
  2. have stable selects, avoiding error 1300

Can you help?

We have 4 nodes running in Docker on Azure, in case that's relevant. The Cassandra version is 3.11.6.

Upvotes: 1

Views: 279

Answers (1)

Alex Ott

Reputation: 87069

Tombstones can exist in the SSTables longer than 10 days because they are only evicted during compaction, and if compaction hasn't happened for a long time, they just stay there. You have the following options available (for 3.11.x):

  • if you have the disk space, you can force a major compaction using nodetool compact -s, which will combine all SSTables into several new ones. This puts a lot of load on the system, as it reads all the data and writes it back
  • use nodetool garbagecollect to evict old data and expired tombstones, although it may not remove all tombstones
  • tune the compaction parameters of the specific table so that compaction happens more often, e.g., decrease the minimum number of SSTables required for compaction (min_threshold) from 4 to 2, and adjust other options (tombstone_threshold, etc.)
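For the third option, the tuning could look like this (ks.reports is a placeholder; min_threshold and tombstone_threshold are SizeTieredCompactionStrategy subproperties):

```sql
-- Placeholder keyspace/table; STCS compaction subproperties.
ALTER TABLE ks.reports WITH compaction = {
  'class': 'SizeTieredCompactionStrategy',
  'min_threshold': '2',          -- compact as soon as 2 SSTables qualify
  'tombstone_threshold': '0.1'   -- compact an SSTable once >10% of it is tombstones (default 0.2)
};
```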

In the future, for repairs it's recommended to use something like Reaper, which performs token-range repairs and puts less load on the system.

Mass deletion of data could be done by external tools, for example:

  • Spark + Spark Cassandra Connector - see this answer for an example
  • DSBulk - you can use the -query option to specify your own query to unload data to disk (only the columns of the primary key, using the :start/:end keywords), and then load that data back with -query 'DELETE FROM table WHERE primary_key = ....'
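A dry-run sketch of the DSBulk pair of commands (only printed here, not executed; ks.reports and report_id are placeholder names):

```shell
# Dry run: the commands are only printed. ks.reports and report_id
# are placeholder names; run the strings directly to actually execute.
UNLOAD="dsbulk unload -url ./old_keys -query \"SELECT report_id FROM ks.reports WHERE timestamp < '2019-01-01' ALLOW FILTERING\""
LOAD="dsbulk load -url ./old_keys -query \"DELETE FROM ks.reports WHERE report_id = :report_id\""
printf '%s\n%s\n' "$UNLOAD" "$LOAD"
```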

And for the schema change - it's not the most trivial task. To match your table structure to your queries, you will most probably need to change the primary key, and in Cassandra this is done only by creating new table(s) and loading the data into them. For that task you'll also need something like Spark or DSBulk, especially if you need to migrate data together with TTL and/or WriteTime. See this answer for more details.
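As a sketch of what such a primary-key change could look like (all names hypothetical), one common pattern is to bucket the partition key by time, so that old data can be deleted one partition at a time:

```sql
-- Hypothetical time-bucketed redesign; all names are placeholders.
CREATE TABLE ks.reports_by_month (
    month     text,       -- e.g. '2018-12', derived from timestamp
    timestamp timestamp,
    report_id int,
    payload   text,
    PRIMARY KEY ((month), timestamp, report_id)
);

-- Deleting a whole month no longer filters on a non-key column:
DELETE FROM ks.reports_by_month WHERE month = '2018-12';
```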

Upvotes: 2
