Mac

Reputation: 517

Does a Cassandra upgrade require running nodetool upgradesstables for a cluster holding TTLed data?

I am running a 3-node Apache Cassandra cluster as Docker containers, holding time-series data with a 45-day TTL.

I am planning to upgrade from the current Cassandra version, 2.2.5, to Cassandra 3.11.4. I have identified the following steps for the upgrade:

  1. Back up the existing data
  2. Drain one of the Cassandra nodes (this flushes its memtables to disk)

    bin/nodetool -h cassandra1 -u ca_itoa -pw ca_itoa drain

  3. Stop the cassandra1 node

  4. Start the new cassandra 3.11.4 container

  5. Upgrade the SSTables

    bin/nodetool -u ca_itoa -pw ca_itoa upgradesstables

  6. Check the node status, then repeat the process for the rest of the nodes.
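As a rough illustration of steps 1, 2 and 6, here is a minimal shell sketch. It assumes that "backup" means a local `nodetool snapshot`, that JMX is reachable remotely with `-h` as in the command above, and that `nodetool status` lists nodes by IP address. The function names, the `NT_USER` / `NT_PASS` variables and the snapshot tag are placeholders of mine, not part of any Cassandra tooling.

```shell
#!/usr/bin/env bash
# Sketch only: NT_USER and NT_PASS are placeholder variables for the JMX credentials.

# Steps 1-2 for one node: take a snapshot as a local backup, then drain the node.
# The snapshot tag "pre-upgrade" is an arbitrary example name.
prepare_node() {
  local host="$1"
  nodetool -h "$host" -u "$NT_USER" -pw "$NT_PASS" snapshot -t pre-upgrade &&
  nodetool -h "$host" -u "$NT_USER" -pw "$NT_PASS" drain
}

# Step 6: succeed only if `nodetool status` lists the given IP as UN (Up/Normal).
node_is_up_normal() {
  local host="$1" ip="$2"
  nodetool -h "$host" -u "$NT_USER" -pw "$NT_PASS" status \
    | awk -v ip="$ip" '$1 == "UN" && $2 == ip { found = 1 } END { exit !found }'
}
```

For example, `prepare_node cassandra1` would run before stopping the old container, and `node_is_up_normal cassandra1 <node-ip>` after the new one has started.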

I have a few questions about the upgrade process:

  1. Are the steps correct?
  2. Is it mandatory to run the upgradesstables command? It is time-consuming, and I would like to avoid it if I can. The data has a TTL set. Will Cassandra write new data in the new SSTable format while the old SSTables get cleaned up as their data expires? My assumption is that, after 45 days, all SSTables would be in the new format.

Upvotes: 3

Views: 2524

Answers (2)

Aaron

Reputation: 57748

Just some additional thoughts:

For Step #5, you actually don't have to run upgradesstables right away. In fact, if you're upgrading a production system, it's probably better that you don't until the application team verifies that they can connect OK. Remember, older versions of the driver which work with 2.2 may not work with 3.11.4.

To this end, I would wait until the entire cluster is running on the new version before running upgradesstables on each node.
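A minimal sketch of that ordering, assuming remote JMX access with `-h` as in the question. The node names `cassandra2` and `cassandra3`, the function names, and the `NT_USER` / `NT_PASS` variables are placeholders I made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: cassandra2/cassandra3 and NT_USER/NT_PASS are hypothetical placeholders.
NODES="cassandra1 cassandra2 cassandra3"

# Return 0 only if every node reports the expected release version.
all_nodes_on_version() {
  local expected="$1" node
  for node in $NODES; do
    nodetool -h "$node" -u "$NT_USER" -pw "$NT_PASS" version \
      | grep -qF "ReleaseVersion: $expected" || return 1
  done
}

# Rewrite SSTables one node at a time; stop at the first failure.
upgrade_all_sstables() {
  local node
  for node in $NODES; do
    echo "upgrading SSTables on $node"
    nodetool -h "$node" -u "$NT_USER" -pw "$NT_PASS" upgradesstables || return 1
  done
}
```

Something like `all_nodes_on_version 3.11.4 && upgrade_all_sstables` would then only start the rewrite once the whole cluster is on the new version.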

Is it mandatory to run the upgradesstables command?

As each Cassandra version is capable of reading its own SSTable format as well as the prior major version, I guess it's not mandatory. But it's definitely something that you should want to do. Especially when upgrading to 3.x.

Cassandra 3 contains a significant upgrade to the storage engine, which results in a much smaller disk footprint. One cluster I upgraded saw a 90% reduction in disk needs.

Plus, you'd be incurring additional latency when reading records which may be spread across the old SSTable files as well as the new. Reads for records across multiple files are bad enough as it is. But now you'd be forcing Cassandra to read and collate results from two formats.

So while I wouldn't say it's "mandatory," I'd definitely say it qualifies as a "good idea."
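If you do decide to wait for TTL expiry instead, you can at least watch how many old-format files remain. The sketch below counts `*-Data.db` files by their format prefix. It assumes the `<version>-<generation>-<format>-Data.db` file naming used since 2.2 (as far as I know, prefixes starting with `l` for 2.2 and with `m` for 3.x) and the default data directory `/var/lib/cassandra/data`; the function name is my own.

```shell
#!/usr/bin/env bash
# Print "<format-version> <count>" for the SSTable data files under a data directory.
# Assumes file names like la-1-big-Data.db (2.2) or md-7-big-Data.db (3.11.4).
# Snapshot and backup directories are skipped so hard links are not counted.
sstable_formats() {
  local data_dir="${1:-/var/lib/cassandra/data}"
  find "$data_dir" -type f -name '*-Data.db' \
       ! -path '*/snapshots/*' ! -path '*/backups/*' \
    | sed 's|.*/||' \
    | cut -d- -f1 \
    | sort | uniq -c \
    | awk '{ print $2, $1 }'
}
```

Running `sstable_formats` on a node after the upgrade shows whether any 2.2-format files are still around.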

Upvotes: 3

LetsNoSQL

Reputation: 1538

Yes, you need to run nodetool upgradesstables on each node after the Cassandra upgrade, since you are upgrading from 2.2.x to 3.11.4. The SSTable file format and file names will also change. You can run this process in the background and it will not cause any issues. Please refer to the link below for more details: https://blog.thethings.io/upgrading-apache-cassandra-cluster/

Upvotes: 2
