Reputation: 1699
I'm working on a POC to showcase how Cassandra works, using Digg as an example. I want a data model that lets me:
1) Add links.
2) Add a link to a user's favorites list.
3) Attach predetermined tags to links.
I came up with two Column Families:
Links
UserFavs
This works fine for requirements #1 and #2, but #3 is trickier. I can add tags such as 'java', 'languages', and 'architecture' as column names with empty values in the Links column family. But then a query such as finding all the links tagged 'java' would take a long time, because it would have to look at every row in Links.
Can anyone suggest some ideas for how this could be implemented?
If I'm not clear with the question, please let me know.
Thanks, Kumar
Upvotes: 2
Views: 1281
Reputation: 42597
You could create your own secondary index, i.e. a column family keyed on tag. Each row contains all the links for that particular tag, one column per link. Note that this may result in very wide rows (i.e. rows with many columns), each of which will be stored on a single Cassandra node. If the rows get very large, you might want a scheme to split them up.
See http://www.datastax.com/docs/0.7/data_model/cfs_as_indexes
or http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/
or search Google for "cassandra secondary index".
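To make the idea concrete, here is a minimal sketch in plain Python, with no Cassandra client involved. Each column family is modelled as a dict of rows, and each row as a dict of columns. The names `tag_index`, `NUM_BUCKETS`, `bucket_for`, `add_link` and `links_for_tag` are made up for illustration, and the bucketing is just one possible way to split a wide row, not a built-in Cassandra feature.

```python
import zlib
from collections import defaultdict

# Assumption: a small fixed number of buckets per tag, chosen for illustration.
NUM_BUCKETS = 4

# Each "column family" is modelled as {row_key: {column_name: value}}.
links = {}                    # row key: link id
tag_index = defaultdict(dict) # row key: "<tag>:<bucket>", columns: link ids


def bucket_for(link_id, num_buckets=NUM_BUCKETS):
    """Pick a stable bucket for a link so one tag is spread over several rows."""
    return zlib.crc32(link_id.encode("utf-8")) % num_buckets


def add_link(link_id, url, tags):
    """Write the link row, then one index column per tag."""
    row = {"url": url}
    for tag in tags:
        row[tag] = ""  # tags as empty-valued columns, as in the question
    links[link_id] = row

    for tag in tags:
        index_key = f"{tag}:{bucket_for(link_id)}"
        tag_index[index_key][link_id] = ""  # column name is the link id


def links_for_tag(tag):
    """Read every bucket row for the tag; the Links rows are never scanned."""
    ids = []
    for bucket in range(NUM_BUCKETS):
        ids.extend(tag_index.get(f"{tag}:{bucket}", {}))
    return sorted(ids)
```

After a few `add_link(...)` calls, `links_for_tag("java")` reads at most `NUM_BUCKETS` index rows instead of every row in Links. The cost is that every write to Links also has to write to the index, and removing a tag means deleting the matching index column yourself.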
Upvotes: 3