Shivankur Pal

Reputation: 163

Run a bulk update query in Cassandra on one column

We have a Cassandra table with over a million records, and we want to run a bulk update on one column (basically, set the column's value to null across the entire table).

Is there a way to do this, given that the query below won't work in CQL?

UPDATE TABLE_NAME SET COL1 = NULL WHERE PRIMARY_KEY IN (SELECT PRIMARY_KEY FROM TABLE_NAME);

P.S. The column is not a primary key or a clustering key.

Upvotes: 2

Views: 2124

Answers (2)

JayK

Reputation: 800

There really isn't a way to do this through CQL, short of iterating through each row and updating the value.

However, there might be a way to do this if you feel adventurous.

You could use COPY in cqlsh to export the table's data to a file. With a tool like sed, you can modify that text file to change the column, and then import the same file back into Cassandra.
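As a rough illustration of the file-editing step, here is a minimal sketch that uses Python's csv module in place of sed. It assumes the table was exported with a header row (for example, COPY ... TO ... WITH HEADER = TRUE in cqlsh); the function name blank_column and its parameters are hypothetical. Whether re-importing an empty field actually overwrites an existing value may depend on the cqlsh version, so try it on a test table first.

```python
import csv


def blank_column(src, dst, column):
    """Copy CSV rows from src to dst, emptying `column` in every row.

    src and dst are open text file objects; the CSV is assumed to have
    a header row naming the columns. Returns the number of data rows.
    """
    reader = csv.DictReader(src)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in CSV header")

    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()

    count = 0
    for row in reader:
        row[column] = ""  # an empty field is cqlsh's default null marker
        writer.writerow(row)
        count += 1
    return count
```

Unlike a sed expression, the csv module handles quoted fields that contain commas or newlines, which is one of the cases where a plain text substitution could go wrong.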

This solution is less than optimal and might not work for certain datasets, but it gets the job done.

Personally, I would still prefer iterating over the rows to doing this.
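The row-by-row iteration could look roughly like the sketch below. It assumes session is a connected Session from the DataStax Python driver (cassandra-driver) and that the table has a single-column primary key; the function name null_out_column and its parameters are hypothetical. Keep in mind that writing null to a column creates a tombstone for every row touched.

```python
def null_out_column(session, table, key_column, column):
    """Set `column` to null in every row of `table`, one UPDATE per row.

    `session` is assumed to behave like cassandra.cluster.Session:
    execute(query, parameters) with %s placeholders, returning an
    iterable of rows that pages through results transparently.
    Table and column names are interpolated directly, so they must
    come from trusted input.
    """
    select_cql = f"SELECT {key_column} FROM {table}"
    update_cql = f"UPDATE {table} SET {column} = null WHERE {key_column} = %s"

    updated = 0
    for row in session.execute(select_cql):
        # row[0] is the primary key value of the current row
        session.execute(update_cql, (row[0],))
        updated += 1
    return updated
```

For a table with millions of rows, a prepared statement and asynchronous execution would likely be faster; this version favours simplicity.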

Upvotes: 3

Horia

Reputation: 2982

There was a similar question the other day about deleting a column in Cassandra for a large dataset. I also suggest reading the section "Dropping a column" in the ALTER TABLE documentation.

One solution in this case might be to drop the column and re-add it, since, as that documentation states:

If you drop a column then re-add it, Cassandra does not restore the values written before the column was dropped. A subsequent SELECT on this column does not return the dropped data.

I would try this on a test system first and check whether the tombstones have been removed.
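A minimal sketch of the drop-and-re-add approach, assuming session is a connected Session from the DataStax Python driver (cassandra-driver). The function name reset_column and its parameters are hypothetical, and the column type has to be supplied by the caller. Some Cassandra versions restrict re-adding a dropped column (for instance with a different type), so this is something to verify on a test system rather than a guaranteed recipe.

```python
def reset_column(session, table, column, cql_type):
    """Drop `column` from `table` and re-add it with type `cql_type`.

    `session` is assumed to behave like cassandra.cluster.Session.
    Names are interpolated directly into the CQL, so they must come
    from trusted input. Returns the statements that were executed.
    """
    statements = [
        f"ALTER TABLE {table} DROP {column}",
        f"ALTER TABLE {table} ADD {column} {cql_type}",
    ]
    for statement in statements:
        session.execute(statement)
    return statements
```

For example, reset_column(session, "my_keyspace.my_table", "col1", "text") would issue the two ALTER TABLE statements in order; the keyspace, table, and type here are placeholders.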

Upvotes: 2
