Reputation: 33
How does compact storage work for a table like this:
Table Index {
userid
keyword
score
fid
PRIMARY KEY (userid, keyword, score)
}
Please ignore the syntax errors in my table :) Let's assume there is one keyword containing 6 fids, divided into 3 groups with different scores. How will Cassandra store the data at the physical layer?
Upvotes: 1
Views: 157
Reputation: 57798
To test this, I created your sample schema (using WITH COMPACT STORAGE) with the above PRIMARY KEY, and ran these 6 INSERTs:
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',87,1);
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',87,2);
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',21,3);
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',21,4);
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',44,5);
INSERT INTO dontnameyourtableindex (userid, keyword, score,fid) VALUES (3,'Star Wars',44,6);
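Note that due to your PRIMARY KEY definition, I ended up with only these three CQL rows, because an INSERT that repeats an existing (userid, keyword, score) combination overwrites the earlier row instead of adding a new one: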
userid | keyword | score | fid
--------+--------------+-------+-----
3 | Star Wars | 21 | 4
3 | Star Wars | 44 | 6
3 | Star Wars | 87 | 2
(3 rows)
The thing with Cassandra PRIMARY KEYs is that they are unique. So if you want to ensure uniqueness down to fid, you should make it the last part of the PRIMARY KEY: PRIMARY KEY (userid, keyword, score, fid). That will ensure uniqueness, and still allow you to sort by keyword and score.
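To make the overwrite behaviour concrete, here is a minimal Python sketch that models a table as a dict keyed by its primary key columns. This is only an illustration, not Cassandra code: the helper `apply_inserts` is hypothetical, and insertion order stands in for the write timestamps Cassandra actually uses to pick the winning value.

```python
# Simplified model of CQL upsert semantics (an illustration, not Cassandra code).
# The six rows below are the INSERTs from the answer: (userid, keyword, score, fid).
inserts = [
    (3, "Star Wars", 87, 1),
    (3, "Star Wars", 87, 2),
    (3, "Star Wars", 21, 3),
    (3, "Star Wars", 21, 4),
    (3, "Star Wars", 44, 5),
    (3, "Star Wars", 44, 6),
]


def apply_inserts(rows, key_len):
    """Replay inserts into a dict keyed by the first key_len columns.

    Assumption: a later insert wins. Real Cassandra compares write
    timestamps; insertion order is used here as a stand-in.
    """
    table = {}
    for row in rows:
        key = row[:key_len]
        table[key] = row[key_len:]  # same key: the earlier value is replaced
    return table


# PRIMARY KEY (userid, keyword, score): fid is a regular column.
three_col_key = apply_inserts(inserts, 3)
# PRIMARY KEY (userid, keyword, score, fid): fid is part of the key.
four_col_key = apply_inserts(inserts, 4)

print(len(three_col_key))  # 3 rows survive
print(len(four_col_key))   # all 6 rows survive
```

With the three-column key, the surviving fid values are 2, 4 and 6, matching the query result above. With fid added to the key, every insert has a distinct key and nothing is overwritten.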
To view how these are structured at the physical level, I can use the cassandra-cli (instead of cqlsh):
[aploetz@unknown] use stackoverflow;
Authenticated to keyspace: stackoverflow
[default@stackoverflow] list dontnameyourtableindex ;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: 3
=> (name=Star Wars:21, value=4, timestamp=1425307959946184)
=> (name=Star Wars:44, value=6, timestamp=1425307961062608)
=> (name=Star Wars:87, value=2, timestamp=1425307959909671)
Note that WITH COMPACT STORAGE keeps the fid column name from appearing in the cell names. Each cell instead shows only the fid value, stored under a name built from the clustering column values (keyword:score).
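As a rough mental model of that layout, the Python sketch below maps the three CQL rows onto one partition per userid, with one cell per (keyword, score) pair whose value is the fid. The function names `to_partitions` and `render` are hypothetical, and the output only imitates the shape of the cassandra-cli listing (timestamps omitted). It is not the real on-disk format.

```python
from collections import defaultdict

# The three CQL rows that survived: (userid, keyword, score, fid).
cql_rows = [
    (3, "Star Wars", 21, 4),
    (3, "Star Wars", 44, 6),
    (3, "Star Wars", 87, 2),
]


def to_partitions(rows):
    """Group rows by partition key (userid).

    Each cell is keyed by the clustering values (keyword, score) and
    holds only the fid value; the column name 'fid' is stored nowhere.
    """
    partitions = defaultdict(dict)
    for userid, keyword, score, fid in rows:
        partitions[userid][(keyword, score)] = fid
    return partitions


def render(partitions):
    """Format partitions in a cassandra-cli-like style, cells in clustering order."""
    lines = []
    for row_key, cells in partitions.items():
        lines.append(f"RowKey: {row_key}")
        for (keyword, score), value in sorted(cells.items()):
            lines.append(f"=> (name={keyword}:{score}, value={value})")
    return lines


layout = render(to_partitions(cql_rows))
print("\n".join(layout))
```

All three cells land under the single RowKey 3, which is why the partition key alone decides where the data lives, while keyword and score only decide the order of cells inside it.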
Upvotes: 1