Ankita
Ankita

Reputation: 2868

Cassandra materialized view partition key update performance

I am trying to update a column in base table which is a partition key in the materialized view and trying to understand its performance implications in a production environment.

Base Table:

CREATE TABLE if not exists data.test
 ( 
foreignid    uuid,

 id           uuid,
        
 kind         text,

 version      text,
           
 createdon    timestamp,
        
**certid**    text,

 PRIMARY KEY(foreignid,createdon,id)
    );

Materialized view:

CREATE MATERIALIZED VIEW if not exists data.test_by_certid 
AS
 SELECT *
FROM data.test
 WHERE id IS NOT NULL AND foreignid 
IS NOT NULL AND createdon IS NOT NULL AND certid IS NOT NULL 
PRIMARY KEY (**certid**, foreignid, createdon, id);

So, certid is the new partition key in our materialized view

What takes place :

1. When we first insert into the test table , usually the certids would
be empty which would be replaced by "none" string and inserted into
the test base table.

2.The row gets inserted into materialized view as well

3. When the user provides us with certid , the row gets updated in the test base table with the new certid

4.the action gets mirrored and the row is updated in materialized view wherein the partition key certid is getting updated from "none"
to a new value

Questions:

1.What is the perfomance implication of updating the partition key certid in the materialized view?

2.For my use case, is it better to create a new table with certid as partition key (insert only when certid in non-empty) and manually
maintain all CRUD operations to the new table or should I use MV and
let cassandra do the bookkeeping?

It is to be noted that performance is an important criteria since it will be used in a production environment.

Thanks

Upvotes: 0

Views: 791

Answers (1)

Duarte Nunes
Duarte Nunes

Reputation: 852

Updating a table for which one or more views exist is always more expensive then updating a table with no views, due to the overhead of performing a read-before-write and locking the partition to ensure concurrent updates play well with the read-before-write. You can read more about the internals of materialized views in Cassandra in ScyllaDb's wiki.

If changing the certid is a one-time operation, then the performance impact shouldn't be too much of a worry. Regardless, it is always a better idea to let Cassandra deal with updating the MV because it will take care of anomalies (such as what happens when the node storing the view is partitioned away and the update is unable to propagate), and eventually ensure consistency.

If you are worried about performance, consider replacing Cassandra with Scylla.

Upvotes: 7

Related Questions