Reputation: 1
I have a large table in clickhouse, and after adding a column with "alter table add column ed_code String", I want to refresh the column data according to the primary key. What should I do? Supplement: Refresh the data for this column alone. "ReplicatedReplacingMergeTree", need to add all of the fields "insert" again. "ReplicatedReplacingMergeTree" not according to the "column" to update?
"ReplicatedReplacingMergeTree"& "ReplicatedVersionedCollapsingMergeTree" Can't do it. Ask for help.
such as:
CREATE TABLE db_name1.tb_name1 (
ed_code
String,dis_code
Int64 ,div_code
Int64 ,div_city
String ,div_code_region
Int64,div_code_dept
Int64 ,dis_code_region
Int64 ,dis_code_city
Int64 ,dis_code_district
Int64, .......100 column ) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{cluster_1shard_s_all}/goldeneye/dim_ed_all_mapping_div', '{replica}', etl_tm) primary key ed_code ORDER BY (ed_code) SETTINGS storage_policy = 'policy_poll', index_granularity = 8192, use_minimalistic_part_header_in_zookeeper = 1;
There are already 1 billion pieces of data in the table above.
Now we need to add new column :
alter table db_name1.tb_name1 add column ed_status Int8 comment 'cabinet status[1:on,0:off,2:damage]';
I need to update the "ed_status" column of "db_name1.tb_name1" according to "ed_code"
I can only use all the column
insert into db_name1.tb_name1 (ed_code,dis_code.... ,ed_status) values(xx,xxx..... ,xxx)
to update?
Is that a waste of resources?
Why not just
insert into db_name1.tb_name1 (ed_code,ed_status) values(xx,xxx)
with the primary key and the field to be updated .
Upvotes: 0
Views: 453
Reputation: 2488
Use asynchonous mutations
ALTER TABLE db.table UPDATE SET ed_code=if(primary_key_field=1,XXX,YYY)
Look at documentation about details:
Poll SELECT * FROM system.mutations WHERE is_done=0 AND database=? AND table=?
for result
Upvotes: 0