Reputation: 352
Assume an object like the following:
performance {
userid,
date,
score1,
score2,
score3,
...
}
I want to be able to query performances sorted on any one of the score fields. Should I create a separate table for each score field, with that field as part of a compound key, or is there a better way to do this?
I hesitate to duplicate performance objects for each score field, as there may be many score fields.
Upvotes: 0
Views: 1738
Reputation: 339
In the performance table, you could consider creating secondary indexes on score1, score2 and score3. However, Cassandra's built-in secondary indexes work best on a column family with many rows that contain the indexed value. So you have to weigh this against your use case, and avoid using secondary indexes to scan a huge volume of records for a small number of results. For sorting, this is not a good solution: a secondary index lets you filter on a column, but it does not let you order results by it.
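To make the secondary-index option concrete, here is a minimal sketch. The column types, the primary key, and the index name are my assumptions, since the question does not give them.

```cql
-- Hypothetical base table; types and PRIMARY KEY are assumed, not from the question.
CREATE TABLE performance (
    userid uuid,
    date   date,
    score1 int,
    score2 int,
    score3 int,
    PRIMARY KEY (userid, date)
);

-- Secondary index on one score column (index name is illustrative).
CREATE INDEX performance_score1_idx ON performance (score1);

-- Equality lookup served by the index. The rows come back in no useful
-- score order, so this does not answer the "sorted by score" query.
SELECT userid, date, score1 FROM performance WHERE score1 = 42;
```

This covers lookups by an exact score value only; ordering by score still needs a table whose clustering columns include that score.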
As a best practice, roughly one table per query pattern is recommended. Data duplication is fine. It also helps to identify the most frequent query patterns and isolate the less frequent ones: some queries might be executed only a few thousand times, while others run a billion times. Also consider which queries are sensitive to latency and which are not. While modelling, keep two high-level rules in balance: spread data evenly around the cluster, and minimise the number of partitions read.
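As a rough sketch of "one table per query pattern" for this question, each score gets its own table with that score as the first clustering column. Partitioning by date, the table names, and the column types are all assumptions on my part; pick the partition key that matches how you actually read the data.

```cql
-- One table per sort order. Partition key (date) and types are assumed.
CREATE TABLE performance_by_score1 (
    date   date,
    score1 int,
    userid uuid,
    score2 int,
    score3 int,
    PRIMARY KEY ((date), score1, userid)
) WITH CLUSTERING ORDER BY (score1 DESC, userid ASC);

CREATE TABLE performance_by_score2 (
    date   date,
    score2 int,
    userid uuid,
    score1 int,
    score3 int,
    PRIMARY KEY ((date), score2, userid)
) WITH CLUSTERING ORDER BY (score2 DESC, userid ASC);

-- Write the same performance to every table; a logged batch keeps them in step.
BEGIN BATCH
    INSERT INTO performance_by_score1 (date, score1, userid, score2, score3)
    VALUES ('2024-01-15', 87, 123e4567-e89b-12d3-a456-426614174000, 42, 15);
    INSERT INTO performance_by_score2 (date, score2, userid, score1, score3)
    VALUES ('2024-01-15', 42, 123e4567-e89b-12d3-a456-426614174000, 87, 15);
APPLY BATCH;

-- Top 10 by score1 for one day: a single-partition read, already sorted.
SELECT userid, score1 FROM performance_by_score1
WHERE date = '2024-01-15' LIMIT 10;
```

One thing to keep in mind with this layout: the score is part of the primary key, so changing a score means deleting the old row and inserting a new one in that table.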
Please see also: Basic Rules of Cassandra Data Modeling
Upvotes: 1
Reputation: 76
I had the same problem, and to solve it I duplicated the tables. I did this because I needed to be able to write a range condition in the WHERE clause, such as WHERE ... AND score1 > 0 AND score1 < 100.
If you only need the equality operator instead, you can try creating an index, without duplicating the table.
If you need to sort the results on those three columns at once, declare the columns you want to order by as clustering columns and create the table with WITH CLUSTERING ORDER BY (score1 DESC, score2 DESC, score3 DESC);
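For illustration, a table along these lines supports both the range condition and the combined ordering. The table name, the column types, and the choice of date as partition key are assumptions, not something taken from the question.

```cql
-- Scores as clustering columns, in the order you want to sort by.
-- userid is added last so that rows with identical scores stay distinct.
CREATE TABLE performance_by_scores (
    date   date,
    score1 int,
    score2 int,
    score3 int,
    userid uuid,
    PRIMARY KEY ((date), score1, score2, score3, userid)
) WITH CLUSTERING ORDER BY (score1 DESC, score2 DESC, score3 DESC, userid ASC);

-- Range condition on the first clustering column, within one partition.
-- Rows come back ordered by score1, then score2, then score3, all descending.
SELECT userid, score1, score2, score3
FROM performance_by_scores
WHERE date = '2024-01-15' AND score1 > 0 AND score1 < 100;
```

Note that this gives one combined ordering (score1 first, then score2, then score3). It does not let you sort by score2 or score3 alone; for that you still need a separate table per score.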
Upvotes: 1