Reputation: 8878
Cassandra data modeling accepts that "Denormalization and duplication of data is a fact of life with Cassandra". But one downside of denormalized data is that updates become hard. For example, if I have three tables serving different queries, selecting is fine. However, what if my app needs to update a username, which means updating all three tables? The update on the first table looks OK. What about the other two? Are those updates going to be very expensive? How should I handle this case?
CREATE TABLE users_by_username (
    username text PRIMARY KEY,
    email text,
    age int
);
CREATE TABLE users_by_email (
    email text PRIMARY KEY,
    username text,
    age int
);
CREATE TABLE groups (
    groupname text,
    username text,
    email text,
    age int,
    hash_prefix int,
    PRIMARY KEY ((groupname, hash_prefix), username)
);
Upvotes: 0
Views: 459
Reputation: 8878
After watching a few YouTube clips, it looks like a Cassandra update is a simple write: the record is appended to the commit log on disk and put into the memtable on the Cassandra server, and an acknowledgement is then sent to the client straight away. So the update call finishes quickly, which makes updates fast from the client's point of view.
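Everything else happens afterwards in the background: memtables are flushed to disk as SSTables using sequential writes, and compaction later merges SSTables, keeping the most recent value for each cell based on the write timestamp.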
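To make that write path concrete, here is a toy model in Python. It is only a sketch of the idea, not Cassandra code: `ToyNode` and all of its methods are hypothetical names, and real commit-log handling, tombstones and replication are left out.

```python
# Toy model of the write path: commit log -> memtable -> SSTable -> compaction.
# Not Cassandra code: every class and method name here is hypothetical.

class ToyNode:
    def __init__(self):
        self.commit_log = []   # append-only log kept for durability
        self.memtable = {}     # in memory: key -> (timestamp, value)
        self.sstables = []     # immutable snapshots flushed from the memtable

    def write(self, key, value, timestamp):
        """Append to the commit log, update the memtable, then acknowledge."""
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        return "ack"  # no read-before-write and no in-place update on disk

    def flush(self):
        """Write the memtable out as one immutable SSTable, sorted by key."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # simplification: flushed data needs no log

    def compact(self):
        """Merge all SSTables, keeping the newest timestamp for each key."""
        merged = {}
        for table in self.sstables:
            for key, (ts, value) in table.items():
                if key not in merged or ts > merged[key][0]:
                    merged[key] = (ts, value)
        self.sstables = [merged] if merged else []

    def read(self, key):
        """Return the newest value across the memtable and all SSTables."""
        found = [t[key] for t in [self.memtable, *self.sstables] if key in t]
        return max(found, key=lambda pair: pair[0])[1] if found else None
```

In this model `write` never touches the SSTables, which is why the client sees a fast acknowledgement. Old versions of a row pile up in separate SSTables until `compact` discards them.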
Upvotes: 0
Reputation: 880
This is a typical problem I see when people try to put a relational model that is updated over time into Cassandra. Cassandra is a great database, and for what it does, it works wonders. There are many features that enable all kinds of different data models, and you can cover almost all use cases. When you look at your use case, the question is: why would you use Cassandra for a relational model? If you really want Cassandra to cover your use case, you will have to do a lot of different operations at the application level just to execute updates and keep your data in a consistent state.
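As a rough idea of what those application-level operations could look like for the three tables in the question, here is a sketch in Python that only builds the CQL statements. The function name `rename_user_statements`, the `memberships` argument and the `%s` placeholder style are assumptions made for illustration. Note that `username` is part of the primary key in `users_by_username` and `groups`, so it cannot be changed with `UPDATE` there. The old row has to be deleted and a new one inserted.

```python
def rename_user_statements(old_username, new_username, email, age, memberships):
    """Build the CQL statements needed to rename a user in all three tables.

    memberships: list of (groupname, hash_prefix) pairs the user belongs to.
    How the application knows these pairs is an assumption of this sketch.
    Returns a list of (statement, parameters) tuples.
    """
    if old_username == new_username:
        return []  # nothing to do; also avoids deleting the row we re-insert

    statements = [
        # username is the primary key here: delete the old row, insert a new one
        ("DELETE FROM users_by_username WHERE username = %s", (old_username,)),
        ("INSERT INTO users_by_username (username, email, age) "
         "VALUES (%s, %s, %s)", (new_username, email, age)),
        # username is a regular column here, so a plain UPDATE works
        ("UPDATE users_by_email SET username = %s WHERE email = %s",
         (new_username, email)),
    ]
    for groupname, hash_prefix in memberships:
        # username is a clustering column in groups: delete + insert again
        statements.append((
            "DELETE FROM groups WHERE groupname = %s AND hash_prefix = %s "
            "AND username = %s", (groupname, hash_prefix, old_username)))
        statements.append((
            "INSERT INTO groups (groupname, hash_prefix, username, email, age) "
            "VALUES (%s, %s, %s, %s, %s)",
            (groupname, hash_prefix, new_username, email, age)))
    return statements
```

One option is to send these statements together in a logged batch (`BEGIN BATCH ... APPLY BATCH`), so that either all of them are eventually applied or none are. A logged batch does not give isolation, and a batch spanning several partitions costs more than individual writes, so whether it fits depends on how often usernames change.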
Upvotes: 1