Charlie Parker

Reputation: 5231

How to make Cassandra have a varying column key for a specific row key?

I was reading the following article about Cassandra:

http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/#.UzIcL-ddVRw

and it seemed to imply that you can have varying column keys in Cassandra for a given row key. Is that true? And if it's true, how do you allow for varying column keys?

The reason I think this might be true is this: say we have users who can each like many items, and we simply want the userId to be the row key. We let this row key (userId) map to all the items that specific user likes. Each user might like a different number of items. Therefore, if we could have multiple column keys, one for each itemId a user likes, we could solve the problem that way.

So, is it possible to have a varying number of Cassandra column keys for a specific row key? (And how do you do it?)

Providing an example and/or some cql code would be awesome!

What confuses me is that I have seen some .cql files that define keyspaces and tables beforehand, and the schema seems pretty inflexible. I don't see how to make it dynamic, i.e. allow it to have additional columns as we please. For example:

CREATE TABLE IF NOT EXISTS results (
    test blob,
    tid timeuuid,
    result text,
    PRIMARY KEY(test, tid)
);

How can this even allow a growing number of columns? Don't we need to specify the names beforehand anyway? Or can the application add custom columns as it needs them?

Upvotes: 2

Views: 416

Answers (2)

cs_alumnus

Reputation: 1659

Yes, you can have a varying number of columns per row_key. Coming from a relational perspective, it's not obvious that tid behaves more like the name of a variable than a fixed column: it acts as a placeholder for the varying column key. Note that in the insert statements below, each column key is supplied as a value, and no new column is ever declared in the schema.

CREATE TABLE IF NOT EXISTS results (
    test blob,
    tid timeuuid,
    result text,
    PRIMARY KEY(test, tid)
);

So in your example, you need to identify the row_key, column_key, and payload of the table. The primary key contains both the row_key and the column_key.

test is your row_key. tid is your column_key. result is your payload.

The following inserts are all valid:

INSERT INTO your_keyspace.results (test, tid, result) VALUES (textAsBlob('row_key_1'), a4a70900-24e1-11df-8924-001ff3591711, 'result_1');
INSERT INTO your_keyspace.results (test, tid, result) VALUES (textAsBlob('row_key_1'), a4a70900-24e1-11df-8924-001ff3591712, 'result_2');
-- notice that the column_key changed but the row_key remained the same
INSERT INTO your_keyspace.results (test, tid, result) VALUES (textAsBlob('row_key_2'), a4a70900-24e1-11df-8924-001ff3591711, 'result_3');
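
The same pattern can be applied to the "user likes items" case from the question. The following is only a sketch: the table name user_likes and the liked_at column are made up for illustration, and it assumes a keyspace has already been selected with USE.

```cql
-- Hypothetical table: user_id is the row (partition) key,
-- item_id is the varying column (clustering) key.
CREATE TABLE IF NOT EXISTS user_likes (
    user_id text,
    item_id text,
    liked_at timestamp,
    PRIMARY KEY (user_id, item_id)
);

-- Each insert adds one more item under the same row key.
INSERT INTO user_likes (user_id, item_id, liked_at) VALUES ('user_1', 'item_a', '2014-03-26 10:00:00+0000');
INSERT INTO user_likes (user_id, item_id, liked_at) VALUES ('user_1', 'item_b', '2014-03-26 10:05:00+0000');

-- A different user can like a different number of items.
INSERT INTO user_likes (user_id, item_id, liked_at) VALUES ('user_2', 'item_a', '2014-03-26 11:00:00+0000');

-- All items liked by one user, read back by row key alone.
SELECT item_id, liked_at FROM user_likes WHERE user_id = 'user_1';
```

Here user_1 ends up with two item_id entries and user_2 with one, without any change to the table definition.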


Upvotes: 2

vivek mishra

Reputation: 1162

Have you thought of exploring the collection support in Cassandra for handling such relations in a colocated way (e.g. on the same data node)?

Not sure if it helps, but what about keeping the user id as the row key, with a map that has the item id as key and some value?
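
A rough sketch of what that could look like in CQL. The table and column names (user_likes_map, liked_items) and the text value type are assumptions for illustration only, and a keyspace is assumed to be selected already.

```cql
-- Hypothetical table: one row per user, with a map from item id to some value.
CREATE TABLE IF NOT EXISTS user_likes_map (
    user_id text PRIMARY KEY,
    liked_items map<text, text>
);

-- Add or overwrite single map entries without rewriting the whole map.
UPDATE user_likes_map SET liked_items['item_a'] = 'some value' WHERE user_id = 'user_1';
UPDATE user_likes_map SET liked_items['item_b'] = 'another value' WHERE user_id = 'user_1';

-- Reading the column returns the entire map for that user.
SELECT liked_items FROM user_likes_map WHERE user_id = 'user_1';
```

Since the map is read back as a whole, this probably fits best when each user likes a fairly small number of items.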

-Vivek

Upvotes: 1
