Reputation: 63
I want to update data for some columns in one table from another table.
For these three columns cf_mng,cf_sds,cf_htg
in the table cust_tabl
there is no data.
I want to update data for these three columns cf_mng,cf_sds,cf_htg
of cust_tabl
with the data of the three columns cust_cd_cnt_1,cust_cd_cnt_2,cust_cd_cnt_3
of custom_hist
table.
This table is having data from 201505 to 201509.
CREATE TABLE custom_hist(
cust_no varchar(20),
cust_cd_cnt_1 float,
cust_cd_cnt_2 float,
cust_cd_cnt_3 float,
cust_dt date,
cust_name string)
PARTITIONED BY (yyyymm int);
This table is having data from 201403 to 201606.
CREATE TABLE cust_tabl(
cust_no string,
cf_mng double,
cf_sds double,
cf_htg double,
cust_loc string,
cust_region string,
cust_country string,
cust_reg_id smallint)
PARTITIONED BY (yyyymm int);
Please help me.
Upvotes: 2
Views: 1821
Reputation: 38290
Join your tables by primary key and overwrite joined partitions. Check the primary key. The join cardinality should be 1:1 or 1:0 otherwise you should apply some row_number
or rank
or some aggregation like max()
to limit rows after join:
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
insert overwrite table cust_tabl partition (yyyymm)
select
c.cust_no,
coalesce(h.cust_cd_cnt_1,c.cf_mng) as cf_mng, --take history column if joined
coalesce(h.cust_cd_cnt_2,c.cf_sds) as cf_sds, --take original if not joined
coalesce(h.cust_cd_cnt_3,c.cf_htg) as cf_htg,
c.cust_loc, --original columns
c.cust_region,
c.cust_country,
c.cust_reg_id,
c.yyyymm --partition is the last
from cust_tabl c
left join custom_hist h
--assume this is the primary key:
on c.cust_no = h.cust_no and c.yyyymm = h.yyyymm;
Upvotes: 2