Reputation: 11247
I am modeling my database in Cassandra, coming from RDBMS. I want to know how can I create a one-to-many relationship which is embedded in the same Column Name and model my table to fit the following query needs.
For example:
Boxes:{
23442:{
belongs_to_user: user1,
box_title: 'the box title',
items:{
1: {
name: 'itemname1',
size: 44
},
2: {
name: 'itemname2',
size: 24
}
}
},
{ ... }
}
I read that its preferable to use composite columns instead of super columns, so I need an example of the best way to implement this. My queries are like:
I am expecting lots of writes to change the size of each item in the box. I want to know the best way to implement it without the need to use super columns. Furthermore, I don't mind getting a solution that takes Cassandra 1.2 new features into account, because I will use that in production.
Thanks
Upvotes: 4
Views: 4844
Reputation: 2900
You can use Query-Driven Methodology, for data modeling.You have the three broad access paths:
1) partition per query
2) partition+ per query (one or more partitions)
3) table or table+ per query
The most efficient option is the “partition per query”. This article can help you in this case, step-by-step. it's sample is exactly a one-to-many relation.
And according to this, you will have several tables with some similar columns. You can manage this, by Materialized View or batch-log(as alternative approach).
Upvotes: 0
Reputation: 732
This particular model is somewhat challenging, for a number of reasons.
For example, with the box ID as a row key, querying for a range of boxes will require a range query in Cassandra (as opposed to a column slice), which means the use of an ordered partitioner. An ordered partitioner is almost always a Bad Idea.
Another challenge comes from the need to increment the item size, as this calls for the use of a counter column family. Counter column families store counter values only.
Setting aside the need for a range of box IDs for a moment, you could model this using multiple tables in CQL3 as follows:
CREATE TABLE boxes (
id int PRIMARY KEY,
belongs_to_user text,
box_title text,
);
CREATE INDEX useridx on boxes (belongs_to_user);
CREATE TABLE box_items (
id int,
item int,
size counter,
PRIMARY KEY(id, item)
);
CREATE TABLE box_item_names (
id int PRIMARY KEY,
item int,
name text
);
BEGIN BATCH
INSERT INTO boxes (id, belongs_to_user, box_title) VALUES (23442, 'user1', 'the box title');
INSERT INTO box_items (id, item, name) VALUES (23442, 1, 'itemname1');
INSERT INTO box_items (id, item, name) VALUES (23442, 1, 'itemname2');
UPDATE box_items SET size = size + 44 WHERE id = 23442 AND item = 1;
UPDATE box_items SET size = size + 24 WHERE id = 23442 AND item = 2;
APPLY BATCH
-- Get items for box by ID
SELECT size FROM box_items WHERE id = 23442 AND item = 1;
-- Boxes by user ID
SELECT * FROM boxes WHERE belongs_to_user = 'user1';
It's important to note that the BATCH mutation above is both atomic, and isolated.
Technically speaking, you could also denormalize all of this into a single table. For example:
CREATE TABLE boxes (
id int,
belongs_to_user text,
box_title text,
item int,
name text,
size counter,
PRIMARY KEY(id, item, belongs_to_user, box_title, name)
);
UPDATE boxes set size = item_size + 44 WHERE id = 23442 AND belongs_to_user = 'user1'
AND box_title = 'the box title' AND name = 'itemname1' AND item = 1;
SELECT item, name, size FROM boxes WHERE id = 23442;
However, this provides no guarantees of correctness. For example, this model makes it possible for items of the same box to have different users, or titles. And, since this makes boxes
a counter column family, it limits how you can evolve the schema in the future.
Upvotes: 2
Reputation: 20204
I think in PlayOrm's objects first, then show the column model below....
Box {
@NoSqlId
String id;
@NoSqlEmbedded
List<Item> items;
}
User {
@NoSqlId
TimeUUID uuid;
@OneToMany
List<Box> boxes;
}
The User then is a row like so
rowkey = uuid=<someuuid> boxes.fkToBox35 = null, boxes.fktoBox37=null, boxes.fkToBox38=null
Note, the form of the above is columname=value where some of the columnnames are composite and some are not.
The box is more interesting and say Item has fields name and idnumber, then box row would be
rowkey = id=myid, items.item23.name=playdo, items.item23.idnumber=5634, itesm.item56.name=pencil, items.item56.idnumber=7894
I am not sure what you meant though on get the top 20 boxes? top boxes meaning by the number of items in them?
Dean
Upvotes: 0