alianos-

Reputation: 916

Equivalent of a batch statement for SELECT in Cassandra

I am dealing with a weird issue and I am not sure if my design is correct. I've got a table that looks like this:

CREATE TABLE Content (
  group_id bigint,
  content_id bigint,
  metadata blob,
  group_payload blob static,
  PRIMARY KEY (group_id, content_id)
);

As you can see, group_payload is static. When I needed to fetch all the data for a given group_id, I used to do it like this:

SELECT * FROM Content WHERE group_id = X;

However, this fetches group_payload multiple times, which causes both performance and memory issues because it is a fairly big blob.

As a result, I've split the query in two, as follows:

SELECT group_payload FROM Content WHERE group_id = X LIMIT 1;
SELECT metadata FROM Content WHERE group_id = X;

This worked wonders as a performance improvement, but it suffers from an occasional race condition: I get the group_payload, but by the time I get the metadata, the group_payload is out of date.

Is there a way to "batch" the two SELECT queries? Should I instead detect the inconsistency and retry (the data allows me to detect it), or is there a better way to do this altogether?

Thanks

Upvotes: 1

Views: 822

Answers (2)

Shailesh

Reputation: 388

You won't be able to batch reads. If reading the metadata is the time-consuming part, read the payload after you read the metadata. If you want to check whether the payload was updated after you fetched the metadata, Cassandra lets you read the write time of a particular column, but that doesn't look like a good solution.
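For illustration only, the write-time check mentioned above might look roughly like the sketch below. It reuses the Content table and the placeholder X from the question; the three-step sequence and the idea of comparing the two timestamps on the client are assumptions, not something the question describes. WRITETIME returns the timestamp of the last write to the column.

```cql
-- 1. Read the payload once, together with its write timestamp.
SELECT group_payload, WRITETIME(group_payload)
FROM Content WHERE group_id = X LIMIT 1;

-- 2. Read the per-row metadata for the same partition.
SELECT content_id, metadata
FROM Content WHERE group_id = X;

-- 3. Re-read only the timestamp (no blob transferred). If it differs
--    from the value seen in step 1, the payload was rewritten in the
--    meantime and the client can retry from step 1.
SELECT WRITETIME(group_payload)
FROM Content WHERE group_id = X LIMIT 1;
```

Note that this only detects a change to group_payload itself; it says nothing about metadata rows written between the steps.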

What's interesting is that your use case needs a single payload per group plus all of the group's metadata. Also, since you are reading a single partition for the metadata, it should take only a few milliseconds. If your payload is updated that frequently, you could settle for results that are outdated but consistent (based on the write times of the metadata and the payload).

Upvotes: 1

Alex Ott

Reputation: 87154

Short answer: no. There is no such thing as a batch for SELECT, because Cassandra provides no snapshot isolation for reads.

In your situation, I would reconsider the order of the data processing: maybe it's OK to get all the metadata first, and then get the group payload?
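As a rough sketch of that ordering, using the Content table and the placeholder X from the question: the use of SELECT DISTINCT here is my own suggestion, relying on the fact that a DISTINCT query may select only partition key and static columns, so the payload comes back once per partition. Whether reading in this order is acceptable depends on your data.

```cql
-- 1. Fetch the per-row metadata first.
SELECT content_id, metadata
FROM Content WHERE group_id = X;

-- 2. Then fetch the static payload, once for the partition.
SELECT DISTINCT group_id, group_payload
FROM Content WHERE group_id = X;
```

This does not make the two reads atomic; it only changes which side can be stale, so the payload you end up with is at least as recent as the metadata.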

Upvotes: 2
