rant S
rant S

Reputation: 1

How do I update partial columns in clickhouse

I have a large table in clickhouse, and after adding a column with "alter table add column ed_code String", I want to refresh the column data according to the primary key. What should I do? Supplement: Refresh the data for this column alone. "ReplicatedReplacingMergeTree", need to add all of the fields "insert" again. "ReplicatedReplacingMergeTree" not according to the "column" to update?

"ReplicatedReplacingMergeTree"& "ReplicatedVersionedCollapsingMergeTree" Can't do it. Ask for help.

such as:

CREATE TABLE db_name1.tb_name1 ( ed_code String, dis_code Int64 , div_code Int64 , div_city String , div_code_region Int64, div_code_dept Int64 , dis_code_region Int64 , dis_code_city Int64 , dis_code_district Int64, .......100 column ) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{cluster_1shard_s_all}/goldeneye/dim_ed_all_mapping_div', '{replica}', etl_tm) primary key ed_code ORDER BY (ed_code) SETTINGS storage_policy = 'policy_poll', index_granularity = 8192, use_minimalistic_part_header_in_zookeeper = 1;

There are already 1 billion pieces of data in the table above.
Now we need to add new column :

alter table db_name1.tb_name1 add column ed_status Int8 comment 'cabinet status[1:on,0:off,2:damage]';

I need to update the "ed_status" column of "db_name1.tb_name1" according to "ed_code"
I can only use all the column

insert into db_name1.tb_name1 (ed_code,dis_code.... ,ed_status) values(xx,xxx..... ,xxx)

to update?
Is that a waste of resources?
Why not just

insert into db_name1.tb_name1 (ed_code,ed_status) values(xx,xxx)

with the primary key and the field to be updated .

Upvotes: 0

Views: 453

Answers (1)

Slach
Slach

Reputation: 2488

Use asynchonous mutations

ALTER TABLE db.table UPDATE SET ed_code=if(primary_key_field=1,XXX,YYY)

Look at documentation about details:

Poll SELECT * FROM system.mutations WHERE is_done=0 AND database=? AND table=? for result

Upvotes: 0

Related Questions