Reputation: 1410
I want to insert a record in hive table but before inserting it i want to check if it that record already exists in the table. that case i won't insert the record and would rather skip it.
Hive version i am using is Hive 1.1.0.
Table structure:
hk | diagnosisid | id | cdcode
-----------------------------------------------------------
EXVWLOORV@#19690321F | 1 | 7810 | I9C
OHQQDUG@#19380630F | 1 | 3643 | I9C
VPLWK@#19610120F | 1 | 2731 | I9C
before inserting the record i want to check if the hk
value is already present or not in the hive table.
Upvotes: 0
Views: 3211
Reputation:
Approach:::1:
If you can write the "IF and LOOP" then you can try this approach.
select count(1) as cnt
from yourtable
where hk='new_value';
if (cnt=0){
---your insert query will be here
}else {
--some message... "hk" already present in table.
}
Approach:::2:
INSERT INTO your_table_name
(hk, diagnosisid, id, cdcode)
SELECT 'asdf' as hk,
'asdf' as diagnosisid,
'ss' as id,
'ddww' as cdcode
WHERE 'asdf'
NOT IN (SELECT hk FROM your_table_name)
Approach:::3:
Then finally you have to write ETL script to perform the actions.
Solution:
In your scenario: (as per table provided in comments) Query will be:
create table pid1(hk string,diagnosisid string,id string,cdcode string);
insert into pid1 values('EXVWLOORV@#19690321F','1','7810','I9C');
insert into pid1 values('OHQQDUG@#19380630F','1','3643','I9C');
insert into pid1 values('VPLWK@#19610120F','1','2731','I9C');
create table pid as select * from pid1 limit 1;
INSERT INTO pid
(hk, diagnosisid, id, cdcode)
SELECT hk,
diagnosisid,
id,
cdcode
from pid1
WHERE hk NOT IN (SELECT hk FROM pid);
All the best!!!
Upvotes: 1