Reputation: 449
In the open source version, Scylla recommends keeping up to 50% of disk space free for “compactions”. At the same time, the documentation states that each table is compacted independently of each other. Logically, this suggests that in a applications with dozens (or even multiple) tables there’s only a small chance that so many compaction will coincide.
Is there a mathematical model of calculating how multiple compaction might overlap in an application with several tables? Based on a cursory analysis, it seems that the likelihood of multiple overlapping compaction is small, especially when we are dealing with dozens of independent tables.
Upvotes: 3
Views: 77
Reputation: 13771
You're absolutely right:
With the size-tiered compaction strategy a compaction may temporarily double the disk requirements. But it doesn't double the entire disk requirements but only of the sstables involved in this compaction (see also my blog post on size-tiered compaction and its space amplification). There is indeed a difference between "the entire disk usage" and just "the sstables involved in this compaction" for two reasons:
The second reason cannot be counted on - since shards choose when to compact independently, if you're unlucky all shards may decide to compact the same table at exactly the same time, and worse - may happen to do the biggest compactions all at the same time. This "unluckiness" can also happen at 100% probability if you start a "major compaction" (nodetool compact).
The first reason, the one which you asked about, is indeed more useful and reliable: Beyond it being unlikely that all shards will choose to compact all sstables are exactly the same time, there is an important detail in Scylla's compaction algorithm which helps here: Each shard only does one compaction of a (roughly) given size at a time. So if you have many roughly-equal-sized tables, no shard can be doing full compaction of more than one of those tables at a time. This is guaranteed - it's not a matter of probability.
Of course, this "trick" only helps if you really have many roughly-equal-sized tables. If one table is much bigger than the rest, or tables have very different sizes, it won't help you too much to control the maximum temporary disk use.
In issue https://github.com/scylladb/scylla/issues/2871 I proposed a idea of how Scylla can guarantee that when disk space is low, the sharding (point 1) is also used to reduce temporary disk space usage. We haven't implemented this idea, but instead implemented a better idea - "incremental compaction strategy", which does huge compactions in pieces ("incrementally") to avoid most of the temporary disk usage. See this blog post for how this new compaction strategy works, and graphs demonstrating how it lowers the temporary disk usage. Note that Incremental Compaction Strategy is currently part of the Scylla Enterprise version (it's not in the open-source version).
Upvotes: 5