Reputation: 3031
I searched on Google but didn't find anything beyond the Cassandra read-side documentation page. So I just want to ask whether there is any API or function already included in the Akka-Cassandra package for batch row insertion, or whether I have to call the insert code multiple times to insert multiple rows.
Note: I am not asking about inserting multiple events. I just want to store some JSON data in key-value format, so a single event containing a JSON object might need multiple rows. In PHP and other languages we can supply an array containing multiple rows, but how does Akka's Cassandra driver implementation offer this?
Upvotes: 0
Views: 462
Reputation: 441
CassandraSession exposes everything you need for batch writes, namely CassandraSession#prepare followed by CassandraSession#executeWriteBatch.
Something like this (note that Lagom's CassandraSession is asynchronous, so both calls return a CompletionStage):
CompletionStage<Done> done = session.prepare(...)
    .thenCompose(ps -> {
        BatchStatement batch = new BatchStatement();
        batch.add(ps.bind(...));
        batch.add(ps.bind(...));
        return session.executeWriteBatch(batch);
    });
That said, notice that read-side handlers built using CassandraReadSide need to return a List<BoundStatement> from the event handler methods. Lagom will automatically execute these statements in a batch.
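To illustrate the "multiple rows per event" shape this enables, here is a minimal plain-Java sketch with no Lagom dependencies. It expands one event's JSON-like payload into one parameter tuple per field, the way an event handler would return one BoundStatement per row for Lagom to batch; the entity-id/key/value table layout is an assumption for illustration, not something from the question:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReadSideBatchSketch {
    // Expand a single event's JSON-like payload into one (entityId, key, value)
    // tuple per field. In a real Lagom handler each tuple would become a
    // BoundStatement, and Lagom would execute the whole list as a batch.
    static List<String[]> rowsForEvent(String entityId, Map<String, String> payload) {
        List<String[]> rows = new ArrayList<>();
        payload.forEach((key, value) -> rows.add(new String[] { entityId, key, value }));
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> payload = new LinkedHashMap<>();
        payload.put("name", "widget");
        payload.put("color", "blue");
        List<String[]> rows = rowsForEvent("item-1", payload);
        System.out.println(rows.size());    // 2 rows from one event
        System.out.println(rows.get(0)[1]); // name
    }
}
```

The point is that the batching happens per event: one event in, a list of statements out, one batch executed.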
Upvotes: 1
Reputation: 925
Lagom's read side processes events one at a time. The only scenario where a batch insert would be possible is if you keep events in memory and persist the batch after a timeout or when the set is big enough. This approach is prone to data loss (or at-most-once semantics), because in case of a crash the event stream will consider the events consumed while the data held in memory is never persisted.
By default, Lagom makes the processing of each event a single transaction that includes both the user-provided code updating the read-side tables and Lagom's own offset store. This allows effectively-once read-side processing, as long as all the operations provided by the user happen within that transaction.
The suggested approach, at the moment, is to shard your persistent entity tag so that your persistent entity's event stream can be consumed by many read-side processor instances in parallel. With that solution, each instance still processes events one at a time, but many instances are distributed across your cluster.
Upvotes: 0