Reputation: 1462
I have a performance question. I got two tables in Cassandra, both of them have exactly same structure. I need to save incoming data in both of them. The problem I have is what would be better solution to do this:
I think first two are ok, but I am not sure if first one is good enough. Can someone explain it to me?
Upvotes: 1
Views: 71
Reputation: 16393
+1 to Aaron's response about using BATCH
statements but the quoted example is specific to CQL. It's a bit more nuanced when implementing it in your app.
If you're using the Java driver, a typical INSERT
statement would look like:
SimpleStatement simpleInsertUser = SimpleStatement.newInstance(
"INSERT INTO users (...) VALUES (?), "..." );
And here's a prepared statement:
PreparedStatement psInsertUserByMobile = session.prepare(
"INSERT INTO users_by_mobile (...) VALUES (...)" );
If you were to batch these 2 statements:
BatchStatement batch = BatchStatement.newInstance(
DefaultBatchType.LOGGED,
simpleInsertBalance,
preparedInsertExpense.bind(..., false) );
session.execute(batch);
For item 2 in your list, I don't know companies who use Cassandra TRIGGERs in production so it isn't something I would recommend. It was experimental for a while and I don't have enough experience to be able to recommend them for production.
For item 3, this is the use case that Materialized Views are trying to solve. They are certainly a lot simpler from a dev point-of-view since the table updates are done server-side instead of client-side.
They are OK to use if you don't have a lot of tables but be aware that the updates to the views happen asynchronously (not at the same time as when the mutations occur on the base table). With MVs, there's also a risk that when the views get so out-of-sync with the base table, the only solution is to drop and recreate the MV.
If you prefer not to use BATCH statements, just make sure you're fully aware of the tradeoffs with using MVs. If you're interested, I've explained it in a bit more detail in https://community.datastax.com/articles/2774/. Cheers!
Upvotes: 0
Reputation: 57808
This sounds like a good use case for BATCH
. Essentially, you can assemble two write statements and execute them in a BATCH
to ensure atomicity. That should keep the two tables in-sync. Example below from the DataStax docs (URL).
cqlsh> BEGIN LOGGED BATCH
INSERT INTO cycling.cyclist_names (cyclist_name, race_id) VALUES ('Vera ADRIAN', 100);
INSERT INTO cycling.cyclist_by_id (race_id, cyclist_name) VALUES (100, 'Vera ADRIAN');
APPLY BATCH;
Upvotes: 1