Reputation: 21
I want to load some of the files from an HDFS directory into a table.
The files in the HDFS directory are listed below.
/data/log/user1log.csv
/data/log/user2log.csv
/data/log/user3log.csv
/data/log/user4log.csv
/data/log/user5log.csv
Now I want to load only the /data/log/user1log.csv and /data/log/user2log.csv files.
I have tried the following.
CREATE EXTERNAL TABLE log_data (username string,log_dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
tblproperties ("skip.header.line.count"="1");
load data inpath '/data/log/user1log.csv' into table log_data;
load data inpath '/data/log/user2log.csv' into table log_data;
But after the data is loaded into the table, the files vanish from their HDFS location. We need to keep the files in that HDFS location.
Please help me.
Thanks in advance.
Upvotes: 0
Views: 76
Reputation: 12900
I don't think that's possible with LOAD DATA INPATH:
it moves the files into the table's directory rather than copying them.
However, you have an external table, so you can make the data available to it without using LOAD DATA INPATH at all.
Here's how you can do it.
Specify the location for your Hive table (LOCATION goes before TBLPROPERTIES, inside the same statement):
CREATE EXTERNAL TABLE log_data (username string, log_dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/data/log_data/table'
TBLPROPERTIES ("skip.header.line.count"="1");
Copy the files to that location (the originals stay in /data/log):
hdfs dfs -cp /data/log/user1log.csv /data/log_data/table/
hdfs dfs -cp /data/log/user2log.csv /data/log_data/table/
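If you need to do this for more than a couple of files, the two copy commands above could be wrapped in a small helper. This is only a rough sketch: the function name copy_logs_to_table is made up for illustration, and it assumes the hdfs client is on your PATH and that the target directory is the one named in the table's LOCATION clause.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hive or Hadoop): copies the given
# HDFS files into the external table's directory, leaving the originals
# where they are.

copy_logs_to_table() {
  local table_dir="$1"   # directory named in the table's LOCATION clause
  shift                  # remaining arguments are the source files

  # Create the table directory if it does not exist yet.
  hdfs dfs -mkdir -p "$table_dir" || return 1

  local src
  for src in "$@"; do
    # -cp copies rather than moves, so the source file is kept.
    hdfs dfs -cp "$src" "$table_dir/" || return 1
  done
}
```

You would then call it as: copy_logs_to_table /data/log_data/table /data/log/user1log.csv /data/log/user2log.csv
The function stops at the first failed command, so a partial copy is reported through a non-zero exit status.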
Upvotes: 2