Reputation: 607
I have two Cassandra tables, a record table and a counter table. The counter table keeps a counter for each kind of record in the record table.
When I insert a new record into the record table, I update the counter table at the same time. But the new record may already be in the record table. Inserting the same record twice is fine, but then I would increment the counter twice, which is not correct.
I have two solutions now:
1. Fetch a record from Cassandra with the new record key. If the result is not null, skip both the insert and the counter increment.
2. Use a lightweight transaction to let Cassandra check whether the record already exists.
Solution 2 makes the insert "atomic", but the docs say it carries a performance penalty. Solution 1 sends two queries, which also carries a performance penalty.
Currently I'm using Solution 1. I'm new to Cassandra lightweight transactions, so I don't know the cost of atomicity. Does anyone know which solution is better?
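To make the two options concrete, here is a minimal sketch of the control flow in Python. It uses an in-memory stand-in (`FakeStore`, a hypothetical class, not a Cassandra client) so it runs without a cluster. The table and column names in the comments are assumptions for illustration only.

```python
from typing import Dict, Set, Tuple


class FakeStore:
    """In-memory stand-in for the record and counter tables (hypothetical)."""

    def __init__(self) -> None:
        self.records: Set[Tuple[str, str]] = set()  # (kind, record_key)
        self.counters: Dict[str, int] = {}          # kind -> count

    def exists(self, kind: str, key: str) -> bool:
        # Roughly: SELECT key FROM records WHERE kind = ? AND key = ?
        return (kind, key) in self.records

    def insert(self, kind: str, key: str) -> None:
        # Roughly: INSERT INTO records (kind, key) VALUES (?, ?)
        # A plain insert is an upsert, so repeating it does no harm here.
        self.records.add((kind, key))

    def insert_if_not_exists(self, kind: str, key: str) -> bool:
        # Roughly: INSERT INTO records (kind, key) VALUES (?, ?) IF NOT EXISTS
        # Cassandra reports the outcome in the [applied] column of the result.
        if (kind, key) in self.records:
            return False
        self.records.add((kind, key))
        return True

    def increment(self, kind: str) -> None:
        # Roughly: UPDATE record_counters SET count = count + 1 WHERE kind = ?
        self.counters[kind] = self.counters.get(kind, 0) + 1


def add_record_read_then_write(store: FakeStore, kind: str, key: str) -> bool:
    """Solution 1: read first, then insert and count only if the record is new."""
    if store.exists(kind, key):
        return False
    store.insert(kind, key)
    store.increment(kind)
    return True


def add_record_lwt(store: FakeStore, kind: str, key: str) -> bool:
    """Solution 2: conditional insert, count only if the insert was applied."""
    if not store.insert_if_not_exists(kind, key):
        return False
    store.increment(kind)
    return True
```

In both variants the counter increment is a separate statement from the record insert. The difference is that in Solution 1 the existence check and the insert are two steps, so two concurrent writers could both pass the check and both increment, while in Solution 2 the check and the insert happen as one conditional operation.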
Upvotes: 3
Views: 2257
Reputation: 4067
Basically you have a few options:
I once ran a simple test against a 3-node Cassandra cluster of m3.large instances (https://aws.amazon.com/ec2/instance-types/). There were 100 partitions with 100 inserts into each (10k inserts in total), all issued from a single thread, so this is not an IO-saturating test.
The schema:
CREATE TABLE IF NOT EXISTS parent_children (
    parentId uuid,
    childId uuid,
    PRIMARY KEY (parentId, childId)
);

CREATE TABLE IF NOT EXISTS child_counters (
    parentId uuid,
    count counter,
    PRIMARY KEY (parentId)
);
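For reference, a single-threaded measurement loop of this shape could look like the sketch below. This is not the harness that was actually used. `insert` is a hypothetical callable that performs one insert strategy (for example, conditional insert plus counter update) against the schema above, and the defaults mirror the 100 partitions by 100 inserts described earlier.

```python
import time
import uuid
from typing import Callable


def mean_insert_latency_ms(
    insert: Callable[[uuid.UUID, uuid.UUID], None],
    partitions: int = 100,
    inserts_per_partition: int = 100,
) -> float:
    """Run inserts sequentially and return the mean latency per insert in ms."""
    total = partitions * inserts_per_partition
    start = time.perf_counter()
    for _ in range(partitions):
        parent_id = uuid.uuid4()            # one partition key per parent
        for _ in range(inserts_per_partition):
            insert(parent_id, uuid.uuid4())  # fresh child id on every insert
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / total
```

Because the loop is sequential, the result reflects per-request round-trip latency rather than cluster throughput.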
The results:
| Insertion method  | Latency per insert (ms) |
|-------------------|-------------------------|
| TRUSTED UNIQUE    | 1.6404                  |
| IF NOT EXISTS     | 4.2801                  |
| READ WRITE ONE    | 3.9382                  |
| READ WRITE QUORUM | 3.7714                  |
Note that QUORUM was unexpectedly a little faster than ONE, but the difference was probably within the margin of error and/or due to specifics of the cluster topology.
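To put the numbers in proportion, the snippet below divides each latency by the TRUSTED UNIQUE figure. It uses only the values from the results above. Treating TRUSTED UNIQUE as the baseline is my choice for illustration.

```python
# Latencies per insert in milliseconds, copied from the results above.
latency_ms = {
    "TRUSTED UNIQUE": 1.6404,
    "IF NOT EXISTS": 4.2801,
    "READ WRITE ONE": 3.9382,
    "READ WRITE QUORUM": 3.7714,
}

baseline = latency_ms["TRUSTED UNIQUE"]

# Relative cost of each method compared with the unchecked insert.
relative_cost = {name: round(ms / baseline, 2) for name, ms in latency_ms.items()}

for name, ratio in relative_cost.items():
    print(f"{name}: {ratio}x")
```

In this test, every approach that checks for an existing row costs roughly 2.3 to 2.6 times as much as the unchecked insert, and the gap between the lightweight transaction and read-then-write is small by comparison.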
Upvotes: 7