Reputation: 2868
I am trying to update a column in base table which is a partition key in the materialized view and trying to understand its performance implications in a production environment.
Base Table:
CREATE TABLE if not exists data.test
(
foreignid uuid,
id uuid,
kind text,
version text,
createdon timestamp,
**certid** text,
PRIMARY KEY(foreignid,createdon,id)
);
Materialized view:
CREATE MATERIALIZED VIEW if not exists data.test_by_certid
AS
SELECT *
FROM data.test
WHERE id IS NOT NULL AND foreignid
IS NOT NULL AND createdon IS NOT NULL AND certid IS NOT NULL
PRIMARY KEY (**certid**, foreignid, createdon, id);
So, certid is the new partition key in our materialized view
What takes place :
1. When we first insert into the test table , usually the certids would
be empty which would be replaced by "none" string and inserted into
the test base table.
2.The row gets inserted into materialized view as well
3. When the user provides us with certid , the row gets updated in the test base table with the new certid
4.the action gets mirrored and the row is updated in materialized view wherein the partition key certid is getting updated from "none"
to a new value
Questions:
1.What is the perfomance implication of updating the partition key certid in the materialized view?
2.For my use case, is it better to create a new table with certid as partition key (insert only when certid in non-empty) and manually
maintain all CRUD operations to the new table or should I use MV and
let cassandra do the bookkeeping?
It is to be noted that performance is an important criteria since it will be used in a production environment.
Thanks
Upvotes: 0
Views: 791
Reputation: 852
Updating a table for which one or more views exist is always more expensive then updating a table with no views, due to the overhead of performing a read-before-write and locking the partition to ensure concurrent updates play well with the read-before-write. You can read more about the internals of materialized views in Cassandra in ScyllaDb's wiki.
If changing the certid
is a one-time operation, then the performance impact shouldn't be too much of a worry. Regardless, it is always a better idea to let Cassandra deal with updating the MV because it will take care of anomalies (such as what happens when the node storing the view is partitioned away and the update is unable to propagate), and eventually ensure consistency.
If you are worried about performance, consider replacing Cassandra with Scylla.
Upvotes: 7