Reputation: 43
I have a table like this
CREATE TABLE my_table(
category text,
name text,
PRIMARY KEY((category), name)
) WITH CLUSTERING ORDER BY (name ASC);
I want to write a query that will sort by name through the entire table, not just each partition.
Is that possible? What would be the "Cassandra way" of writing that query?
I've read other answers in the StackOverflow site and some examples created single partition with one id (bucket) which was the primary key but I don't want that because I want to have my data spread across the nodes by category
Upvotes: 3
Views: 585
Reputation: 42273
Always model cassandra tables through the access patterns (relational db / cassandra fill different needs).
Up to Cassandra 2.X, one had to model new column families (tables) for each access pattern. So if your access pattern needs a specific column to be sorted then model a table with that column in the partition/clustering key. So the code will have to insert into both the master table and into the projection table. Note depending on your business logic this may be difficult to synchronise if there's concurrent update, especially if there's update to perform after a read on the projections.
With Cassandra 3.x, there is now materialized views, that will allow you to have a similar feature, but that will be handled internally by Cassandra. Not sure it may fit your problem as I didn't play too much with 3.X but that may be worth investigation.
More on materialized view on their blog.
Upvotes: 2
Reputation: 9475
Cassandra doesn't support sorting across partitions; it only supports sorting within partitions.
So what you could do is query each category separately and it would return the sorted names for each partition. Then you could do a merge of those sorted results in your client (which is much faster than a full sort).
Another way would be to use Spark to read the table into an RDD and sort it inside Spark.
Upvotes: 4