Ka Mok
Ka Mok

Reputation: 1988

How do you UPDATE a Cassandra column without directly knowing the primary key?

Given a scenario where you have a User table, with id as PRIMARY KEY. You have a column called email, and a column called name.

You want to UPDATE User.name based on User.email

I realized that the UPDATE command requires you to pass in a PRIMARY KEY. Does this mean I can't use a pure CQL migration, and would need to first query for the User.id primary key before I can UPDATE?

In this case, I DO know the PRIMARY KEY because the UUIDs are the same for dev and prod, but it feels dirty.

Upvotes: 1

Views: 3713

Answers (2)

Jim Witschey
Jim Witschey

Reputation: 285

I think Alex's answer covers your question -- "how can I find a value in a PK column working backwards from a non-PK column's value?".

However, I think it's worth noting that asking this question indicates you should reconsider your data model. A rule of thumb in C* data model design is that you begin by considering the queries you need, and you've missed the UPDATE query use case. You can probably make things work without changing your model for now, but if you find you need to make other queries you're unprepared for, you'll run into operational issues with lots of indexes and/or MVs.

More generally, search around for articles and other resources about Cassandra data modeling. It sounds like you're basically using C* for a relational use case so you'll want to look into that.

Upvotes: 1

Alex Ott
Alex Ott

Reputation: 87154

Yes, you're correct - you need to know primary key of the record to perform an update on the data, or deletion of specific record. There are several options here, depending of your data model:

  1. Perform full scan of the table using effective token range scan (Look to this answer for more details);
  2. If this is required very often, you can create a materialized view, with User.email as partition key, and fetch all message IDs that you can update (but you'll need to do this from your application, there is no nested query support in CQL). But also be aware that materialized views are "experimental" feature in Cassandra, and may not work all the time (it's more stable in DataStax Enterprise). Also, if you have some users with hundreds of thousands of emails, this may create big partitions.
  3. Do like 2nd item with your code, by using an additional table

Upvotes: 4

Related Questions