Reputation: 155
Let say i have users. Those users can have access to multiple projects. So a project can also allow multiple users.
So I model four tables. users (by_id), projects (by id), projects_by_user_id and users_by_project_id.
----------- ------------ -------------------- -------------------- | users | | projects | | projects_by_user | | users_by_project | |---------| |--------- | |------------------| |------------------| | id K | | id K | | user_id K | | project_id K | | name | | name | | project_id C | | user_id C | ----------- ------------ | project_name S | | user_name S | -------------------- --------------------
So storing the user_name in the users_by_project and the projet_name in the projects_by_user table for querying.
The problem I have is when an user updates the project_name, this will of course update the projects table. But for data consistency I also need to update each partition in the projects_by_user table.
As far as I can see, this is only possible by querying all the users from the users_by_project table and doing an update for each user.
Is there any better way without first reading lots of data?
Upvotes: 1
Views: 219
Reputation: 9475
I don't see why you need four tables. Your users and projects tables could contain all of the data.
If you define the tables like this:
CREATE TABLE users (
user_id int PRIMARY KEY,
name text,
project_ids list<int> );
CREATE TABLE projects (
project_id int PRIMARY KEY,
name text,
user_ids list<int> );
Then each user would have a list of project ids they have access to, and each project would have a list of users that have access to it.
To add access to project 123 to user 1 you would run:
BEGIN BATCH
UPDATE users SET project_ids = project_ids + [123] WHERE user_id=1;
UPDATE projects SET user_ids = user_ids + [1] WHERE project_id=123;
APPLY BATCH;
To change a project name, you would just do:
UPDATE projects SET name = 'New project name' WHERE project_id=123;
For simplicity I showed the id fields as int's, but normally you would use uuid's for that.
Upvotes: 0
Reputation: 2138
I don't think there is better way. Cassandra has a lot of limitation on the queries you can make. In your case, you have to create a compound key (user_id, project_id), and in order to update it you have to provide both parts in where clause, which means you have to read all users for specific project and update each of these. If you have a large database and this scenario will happen often, this would be significant overhead, so I guess it would be better to remove projectname field from the table and perform join of the projects and projects_by_users at the application level.
BTW: Scenario you described here is more convenient for relational database model, so if the rest of your database model is similar to this, I would think of using some relational database instead.
Upvotes: 0