MBZ
MBZ

Reputation: 27592

Best equivalent of SQL Server UPDATE command in Hive

What is the best (less expensive) equivalent of SQL Server UPDATE SET command in Hive?

For example, consider the case in which I want to convert the following query:

UPDATE TABLE employee 
SET visaEligibility = 'YES' 
WHERE experienceMonths > 36

to equivalent Hive query.

Upvotes: 1

Views: 2086

Answers (3)

USB
USB

Reputation: 6139

You can upgrade your hive to 0.14.0

Starting from 0.14.0 hive supports UPDATE operation.

To do the same we need to create hive tables such that they support ACID output format and need to set additional properties in hive-site.xml.

How to do CURD operations in Hive

Upvotes: 0

user2118965
user2118965

Reputation: 1

You can create an external table and use the 'insert overwrite into local directory' and in case you want to change the column values, you can use 'CASE WHEN', 'IF' or other conditional operations. And copy the output file back to HDFS location.

Upvotes: 0

Charles Menguy
Charles Menguy

Reputation: 41428

I'm assuming you have a table without partitions, in which case you should be able to do the following command:

INSERT OVERWRITE TABLE employee SELECT employeeId,employeeName, experienceMonths ,salary, CASE WHEN experienceMonths >=36 THEN ‘YES’ ELSE visaEligibility END AS visaEligibility FROM employee;

There are other ways but they are much more convoluted, I think the way Bejoy described is the most efficient.

(source: Bejoy KS blog)

Note that if you have to do this on a partitioned table (which is likely if you have a lot of data), you would probably need to overwrite your partition when doing this.

Upvotes: 2

Related Questions