pdxleif

Reputation: 1800

Cassandra DELETE using multi-partition BATCH vs IN operator

We have a table where we need to process a group of deletes atomically, spread across partition keys. Would it be preferable to do this with a LOGGED BATCH of separate DELETE statements, or with one DELETE that lists the keys to be deleted in a WHERE ... IN (...) clause?

Upvotes: 1

Views: 160

Answers (1)

clunven

Reputation: 1695

Neither option (BATCH or IN) is optimal: depending on your queries, both move the load from your clients to the coordinator nodes. If possible, issue parallel queries from the client side instead, so the load is distributed across nodes.
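As a rough illustration of the client-side approach, here is a minimal sketch using only the Python standard library. The `execute(cql, params)` callable is a hypothetical stand-in for whatever your driver uses to run one statement, and the table name, key column and `%s` placeholder style are assumptions as well. If you use the DataStax Python driver, its `cassandra.concurrent.execute_concurrent_with_args` helper covers the same ground.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_in_parallel(execute, keys, max_in_flight=8):
    """Run one single-partition DELETE per key, at most max_in_flight at once.

    `execute(cql, params)` is a hypothetical stand-in for the driver call that
    runs a single statement; it is expected to raise on failure.
    Returns the keys whose delete failed, in the order they were submitted.
    """
    # Hypothetical keyspace, table and partition key column.
    cql = "DELETE FROM my_keyspace.my_table WHERE pk = %s"
    failed = []
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # One future per key; each statement targets exactly one partition.
        futures = {pool.submit(execute, cql, (key,)): key for key in keys}
        for future, key in futures.items():
            # exception() blocks until the future has finished.
            if future.exception() is not None:
                failed.append(key)
    return failed
```

Note that this gives up the all-or-nothing behavior asked about in the question. Deletes are idempotent, though, so the returned keys can simply be retried.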

Now, some insights on the question regarding BATCH and IN:

  • In both cases, limit the number of statements per group (around 20).

  • If you want to group deletes, try to delete the largest possible block in one go, such as a whole partition (which creates a single partition-level tombstone).

  • A logged batch will retry failed statements until the batch timeout, so it could create more load on the coordinator nodes than IN. Atomicity is only ensured at the partition level; if you are deleting across partitions, use IN.
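To make the group-size advice concrete, here is a small sketch that splits the keys into groups of at most 20 and builds the CQL text for either form. The table and column names passed in are hypothetical, and the `?` bind markers assume the statements will be prepared before execution.

```python
from itertools import islice


def chunked(keys, size=20):
    """Yield successive lists of at most `size` keys."""
    it = iter(keys)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def in_delete(table, key_column, n):
    """CQL for one DELETE with n bind markers in the IN clause."""
    markers = ", ".join("?" for _ in range(n))
    return f"DELETE FROM {table} WHERE {key_column} IN ({markers})"


def logged_batch(table, key_column, n):
    """CQL for a batch of n single-key DELETEs (BEGIN BATCH is logged by default)."""
    lines = ["BEGIN BATCH"]
    lines += [f"  DELETE FROM {table} WHERE {key_column} = ?;" for _ in range(n)]
    lines.append("APPLY BATCH;")
    return "\n".join(lines)
```

For example, 45 keys would produce three groups (20, 20 and 5 keys), each of which maps to one IN statement or one batch.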

Upvotes: 1
