Sagi2313
Sagi2313

Reputation: 105

Updating column by calculation in cassandra

I am trying to add a new column in cassandra table to aid in some client side operations. The table is already there, and it has data in it. The primary key is a bigint.

The idea is to add a column on the table that would create domains of the primary key.

So the new column here is a int that must have the value of the <primary-key % 10>, which will give me 10 possible values for the new column, such as 10 pages based on the last digit of the primary key. The primary key is named idx and the new column is called idx_page.

One idea was to add the new 'idx_page' column as CLUSTERING column, but this didnt seem to work, using an ALTER TABLE statement. IF this is supposed to work, then I am probably doing something wrong with my cql statement for altering the table.

Next thing I tried was to add new column, then add an index on that, because eventually client side will need to do SELECT based on the value of that column in order to fetch all rows that have a specific value on the new column. This worked fine.

Regardless of whether idx_page would be a clustering column or a plain column with index on it, there are still 2 issues to solve:

  1. existing rows have a NULL "value" on the new column (which was expected)
  2. I have to figure out the best way that when a new row is inserted, the new column will be assigned with a value that is calculated based on primary key, preferably not by changing all the client side code to do this manually.

I guess I could get around both issues by using cassandra FUNCTIONS and AGGREGATES ?

Ideally I would like to be able to call some function manually once, in order to update all existing rows, and assign a value to idx_page, which is always calculated by (idx % 10).

Then upon INSERT, I would like to have the same logic run automatically so that idx_page gets a value at the insertion phase.

I am not sure how to proceed with this, or even if I am on the right path with UDF/UDA.

Any comments or advice please.

Alex

Upvotes: 0

Views: 521

Answers (1)

GAK
GAK

Reputation: 1078

You can build a custom function to accomplish this.

https://docs.datastax.com/en/cql/3.3/cql/cql_using/useCreateUDF.html

CREATE FUNCTION IF NOT EXISTS moduluo (input bigint) 
   CALLED ON NULL INPUT 
   RETURNS int 
   LANGUAGE java AS '
     return input % 10;
     ';

Upvotes: 1

Related Questions