Reputation: 27592
What is the best (less expensive) equivalent of SQL Server UPDATE SET
command in Hive
?
For example, consider the case in which I want to convert the following query:
UPDATE TABLE employee
SET visaEligibility = 'YES'
WHERE experienceMonths > 36
to equivalent Hive
query.
Upvotes: 1
Views: 2086
Reputation: 6139
You can upgrade your hive to 0.14.0
Starting from 0.14.0 hive supports UPDATE operation.
To do the same we need to create hive tables such that they support ACID output format and need to set additional properties in hive-site.xml.
How to do CURD operations in Hive
Upvotes: 0
Reputation: 1
You can create an external table and use the 'insert overwrite into local directory' and in case you want to change the column values, you can use 'CASE WHEN', 'IF' or other conditional operations. And copy the output file back to HDFS location.
Upvotes: 0
Reputation: 41428
I'm assuming you have a table without partitions, in which case you should be able to do the following command:
INSERT OVERWRITE TABLE employee SELECT employeeId,employeeName, experienceMonths ,salary, CASE WHEN experienceMonths >=36 THEN ‘YES’ ELSE visaEligibility END AS visaEligibility FROM employee;
There are other ways but they are much more convoluted, I think the way Bejoy described is the most efficient.
(source: Bejoy KS blog)
Note that if you have to do this on a partitioned table (which is likely if you have a lot of data), you would probably need to overwrite your partition when doing this.
Upvotes: 2