user5228393
user5228393

Reputation:

How does inserting into a Managed table in Hive affect the data in the other table?

If I have this Managed table in Hive below:

CREATE TABLE Employee (
   empId int,
   name VARCHAR(100))
COMMENT 'Employees'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
LINES TERMINATED BY ‘\n’
STORED AS TEXTFILE;

Now if I insert into the above table like this:

INSERT INTO TABLE Employee SELECT * FROM Employee_Test; 

Since the Employee table is a Managed table, when I issue the INSERT INTO command above will the data from Employee_Test be permanently moved into the Employee table Directory (which in this case would be /user/hive/warehouse since it's a managed Table)?

Therefore, the data will not reside in Employee_Test anymore?

Upvotes: 1

Views: 562

Answers (2)

OneCricketeer
OneCricketeer

Reputation: 191743

Will data from an external table be moved during a selection? No, it will not. The INSERT FROM SELECT query is not a "move", it's a "copy"

You're welcome to hadoop fs -ls the external table location to verify that.

The only way an external table's data is removed is by actually deleting the table location from disk.

which in this case would be /user/hive/warehouse

The warehouse directory is configurable, but I believe that's the default, yes

Upvotes: 1

Bala
Bala

Reputation: 11244

SELECT * FROM Employee_Test produces a set of rows (result set) that are then inserted into Employee. Therefore Employee_Test will continue to hold those rows. In contract LOAD DATA INPATH... would actually move the data.

Upvotes: 1

Related Questions