Reputation: 48490
I'm trying to understand some fundamentals of Cassandra. I was under the impression that one advantage a developer has when designing a data model is the ability to dynamically add columns to a row identified by a key. That means I can model my data so that, where it makes sense, a key can be something such as a user_id from a relational database, and I can, for example, create an arbitrary number of columns that relate to that user.
What I don't understand is why there is so much emphasis on predefined columns in CQL examples, particularly in the CREATE TABLE/COLUMNFAMILY examples:
CREATE TABLE emp (
    empID int,
    deptID int,
    first_name varchar,
    last_name varchar,
    PRIMARY KEY (empID, deptID)
);
Wouldn't it make more sense to just put this type of model into a relational database? What if I don't know my column name until runtime and need to create it dynamically? Do I have to use ALTER TABLE to add a new column to the row using CQL? For the particular use case I have in mind, I would just need a key identifier and arbitrary column names, where the column name might include a timestamp+variable_identifier.
Is Cassandra the right tool for that? Are the predefined columns in the documentation nothing more than an example? How does one add a dynamic column name to an existing column family/table?
Upvotes: 3
Views: 3385
Reputation: 19377
My answer from the mailing list:
Schemalessness is not a fundamental concept in Cassandra, at all. You're probably suffering from too much exposure to document databases. Experience has shown that having a schema that says "the email column is text, and the birth date column is a timestamp" is very useful as projects and teams grow.
There's nothing wrong with the relational model per se (subject to the usual explanation about needing to denormalize to scale). Cassandra is about making applications scale, not throwing the SQL baby out with the bathwater for the sake of being different.
That said, if you really don't know what kinds of attributes might apply (generally because they are user-generated) you can use a Map.
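As a minimal sketch of the Map approach, assuming a hypothetical users table (the table name, column names, and values below are illustrative and not from the mailing list answer), the known attributes keep their declared types while the unpredictable ones go into a map column:

```cql
-- Hypothetical table: typed columns for known attributes,
-- plus a map for attributes that are only known at runtime.
CREATE TABLE users (
    user_id int PRIMARY KEY,
    email text,
    birth_date timestamp,
    attributes map<text, text>
);

-- Set (or overwrite) a single user-generated attribute; no ALTER TABLE needed.
UPDATE users SET attributes['favorite_color'] = 'blue' WHERE user_id = 1;

-- Merge several attributes into the existing map at once.
UPDATE users SET attributes = attributes + {'shoe_size': '42', 'nickname': 'jd'}
WHERE user_id = 1;
```

Note that all keys share one type and all values share one type, so this suits loosely structured attributes rather than columns that need individual typing.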
Upvotes: 1
Reputation: 14173
Do I have to use ALTER TABLE to add a new column to the row using CQL?
Yes, the schema must be defined before you can insert into 'new columns'. However, you could define a single column that is a collection of data. Look at the 'tag' example in DataStax's 'thrift to cql upgrade' blog post, under mixing dynamic and static columns.
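A rough sketch of the collection idea, loosely in the spirit of a tag column; the table and column names here are assumptions for illustration and are not copied from the blog post:

```cql
-- Hypothetical table: fixed, typed columns plus one collection column
-- that holds an open-ended set of values.
CREATE TABLE user_profiles (
    user_id text PRIMARY KEY,
    first_name text,
    last_name text,
    tags set<text>
);

-- Add tags at runtime without changing the schema.
UPDATE user_profiles SET tags = tags + {'cassandra', 'nosql'} WHERE user_id = 'jdoe';

-- Remove a tag again.
UPDATE user_profiles SET tags = tags - {'nosql'} WHERE user_id = 'jdoe';
```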
How does one add a dynamic column name with an existing column family/table?
In CQL you first have to alter the structure of the table (column family) using the ALTER keyword. My guess is that this ensures that column families contain only the specified columns, eliminating the chance of a column being added by mistake (better data quality).
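For instance, against the emp table from the question, adding a column and then writing to it might look like this (the email column and the inserted values are made up for illustration):

```cql
-- Declare the new column first; existing rows simply have no value for it.
ALTER TABLE emp ADD email varchar;

-- Only after the ALTER can the column be written to.
INSERT INTO emp (empID, deptID, email) VALUES (104, 15, 'jane@example.com');
```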
Is Cassandra the right tool for that?
I think it is. However, if you need to add columns on the fly without issuing schema-altering statements, you should probably look into the Thrift-based APIs, which can do that. Just a friendly warning: DataStax advises that new applications use CQL.
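If you stay with CQL, one common way to model the timestamp+variable_identifier case from the question is a compound primary key, where the clustering columns play the role the dynamic column name would play in Thrift. This is only a sketch under assumed names (user_events, event_time, variable_id, and value are hypothetical):

```cql
-- Hypothetical table: user_id is the partition key; event_time and
-- variable_id are clustering columns, so each (event_time, variable_id)
-- pair becomes a new cell in the user's partition.
CREATE TABLE user_events (
    user_id int,
    event_time timestamp,
    variable_id text,
    value text,
    PRIMARY KEY (user_id, event_time, variable_id)
);

-- Each insert adds a new entry to the partition; no schema change is required.
INSERT INTO user_events (user_id, event_time, variable_id, value)
VALUES (42, '2013-01-15 10:30:00+0000', 'cpu_load', '0.75');

-- Read back everything stored for one user, ordered by the clustering columns.
SELECT event_time, variable_id, value FROM user_events WHERE user_id = 42;
```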
Upvotes: 4