Reputation: 428
I have at a certain point in my software a list of primary keys of which I want to retrieve information from a massively huge table, and I'm wondering what's the most practical way of doing this. Let me illustrate:
Let this be my table structure:
CREATE TABLE table_a(
name text,
date datetime,
key int,
information1 text,
information2 text,
PRIMARY KEY ((name, date), key)
)
say I have a list of primary keys:
list = [['Jack', '2015-01-01 00:00:00', 1],
['Jack', '2015-01-01 00:00:00', 2],
['Richard', '2015-02-14 00:00:00', 5],
['David', '2015-01-01 00:00:00', 9],
...
['Last', '2014-08-13 00:00:00', 12]]
Say this list is huge (hundreds of thousands) and not ordered in any way. I want to retrieve, for every key on the list, the value of the information
columns.
As of now, the way I'm solving this issue is executing a select query for each key, and that has been sufficient hitherto. However I'm worried about execution times when the list of keys get too huge. Is there a more practical way of querying cassandra for a list of rows of which I know the primary keys without executing one query per key?
If the key was a single field, I could use the select * from table where key in (1,2,6,3,2,4,8)
syntax to obtain all the keys I want in one query, however I don't see how to do this with composite primary keys.
Any light on the case is appreciated.
Upvotes: 1
Views: 1669
Reputation: 57843
The best way to go about something like this, is to run these queries in parallel. You can do that on the (Java) application side by using async futures, like this:
Future<List<ResultSet>> future = ResultSets.queryAllAsList(session,
"SELECT * FROM users WHERE id=?",
UUID.fromString("0a63bce5-1ee3-4bbf-9bad-d4e136e0d7d1"),
UUID.fromString("7a69657f-39b3-495f-b760-9e044b3c91a9")
);
for (ResultSet rs : future.get()) {
... // process the results here
}
Upvotes: 1
Reputation: 66
Create a table that has the 3 columns worth of data piped together into a single value and store that single string value in a single column. Make that column the PK. Then you can use the IN clause to filter. For example, select * from table where key IN ('Jack|2015-01-01 00:00:00|1', 'Jack|2015-01-01 00:00:00|2').
Hope that helps! Adam
Upvotes: 0