user2567697

Reputation: 91

best way to run nodetool upgradesstables after update?

I'm currently upgrading a 21-node cluster from 0.8 to 1.0.11. The Cassandra upgrade process requires that sstables be rewritten in the latest format after the software is upgraded (via nodetool upgradesstables). This step seems to take a very long time: one node has been running it for 48 hours and still isn't done.

I would like to know if it's advisable to do this in parallel on all the nodes. Specifically, what would be the performance implications? This cluster is under fairly heavy r/w use and needs to be available 24/7.

Upvotes: 9

Views: 5411

Answers (2)

Gary

Reputation: 11

I run the upgrade simultaneously across all nodes. On Linux, I run the command

nohup nodetool upgradesstables &

and then log out and leave it running. It's a low-priority task, and it will take as long as it needs to rewrite every sstable that needs rewriting. I haven't noticed any latency issues while the upgrade is running.
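To start the same command for several nodes from one machine, a dry run along these lines may help. It is only a sketch: the node names are hypothetical, the log file names are my own choice, and it assumes nodetool can reach each node's JMX port through its -h (host) option. It prints the commands rather than running them, so you can review them first.

```shell
#!/bin/sh
# Hypothetical host list; replace with your own node names.
NODES="node1 node2 node3"

# Print the command that would start the upgrade for one node.
# Assumption: nodetool can connect to the node remotely via -h.
upgrade_command() {
    echo "nohup nodetool -h $1 upgradesstables > upgradesstables-$1.log 2>&1 &"
}

# Dry run: print one command per node instead of executing it.
for node in $NODES; do
    upgrade_command "$node"
done
```

Once the printed commands look right, they can be pasted into a shell or piped to sh.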

If, for example, you have 1TB of data per node (naughty!), then the upgrade needs to rewrite all 1TB of data across multiple files. Reading and writing this much data at the slow rate it runs can take several days.

Note: sstables are immutable, and snapshots (backups) are hard links to sstable files. While the upgrade rewrites each sstable, the old file stays on disk for as long as a snapshot still links to it, so disk usage can temporarily double. Watch your disk space and delete snapshots if needed to free up room, especially if your data already uses more than 50% of the disk.
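A quick way to keep an eye on this is a small check like the one below. It is a sketch under assumptions: the data directory path is the common default and may differ on your nodes, and the helper names are mine. It reports how full the filesystem holding the data directory is and warns once usage passes the 50% mark.

```shell
#!/bin/sh
# Assumption: data lives under /var/lib/cassandra/data; override DATA_DIR if not.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

# Print the used percentage (digits only) of the filesystem holding $1.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is above 50%.
over_half_full() {
    [ "$1" -gt 50 ]
}

# Only run the check where the data directory actually exists.
if [ -d "$DATA_DIR" ]; then
    pct=$(disk_used_pct "$DATA_DIR")
    if over_half_full "$pct"; then
        echo "WARNING: $DATA_DIR filesystem is ${pct}% full; consider clearing old snapshots"
    else
        echo "OK: $DATA_DIR filesystem is ${pct}% full"
    fi
fi
```

If the warning fires, nodetool clearsnapshot is the usual way to remove snapshots you no longer need.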

Upvotes: 1

Florent

Reputation: 1399

During the upgrade, your nodes will be rewriting every sstable at a rate capped by "compaction_throughput_mb_per_sec".

My guess is that the performance implications are directly linked to the value of this setting. A low value (the default is 16 MB/s, and you can go lower) should allow you to upgrade your cluster without slowing it down.
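For a rough sense of how long the rewrite takes at a given cap, you can divide the data size by the throughput. The sketch below does this with integer shell arithmetic; the function name and the example figures (1TB per node, taken as 1024 GB, at 16 MB/s) are illustrative. Treat the result as a lower bound, since a node that is also serving reads and writes will usually be slower.

```shell
#!/bin/sh
# Rough lower bound on rewrite time: data size divided by the throughput cap.
# Arguments: data per node in GB (1 GB taken as 1024 MB), cap in MB/s.
estimate_rewrite_hours() {
    data_gb=$1
    throughput_mb=$2
    seconds=$(( data_gb * 1024 / throughput_mb ))
    # Integer division, so the result is rounded down to whole hours.
    echo $(( seconds / 3600 ))
}

estimate_rewrite_hours 1024 16   # prints 18
```

Halving the cap doubles the estimate, so lowering the setting trades a longer upgrade for less impact on live traffic. In versions that provide it, nodetool setcompactionthroughput changes the cap on a running node without a restart.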

Upvotes: 7
