Reputation: 123
I have many 10 MB log files that I need to load into Hive. Later I will need to add a few more log files to the existing table. Can anyone help me with this?
Upvotes: 1
Views: 26522
Reputation: 3845
Why not create an external table in Hive that points to a specific location, and put your files in that location? The external table will automatically pick up any new files placed in that folder (as long as the schema is the same).
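A minimal sketch of the external-table approach described above. The table name `web_logs`, the single `log_line` column, and the HDFS location are all hypothetical; adjust them to the real log layout.

```sql
-- Hypothetical table, column, and location: replace with your own.
-- Each row holds one raw log line as a single string.
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  log_line STRING
)
STORED AS TEXTFILE
LOCATION '/user/hive/external/web_logs';

-- Later, copy another log file into the same location from the Hive shell.
-- The local path is a placeholder.
dfs -put /local/logs/new_log_file.log /user/hive/external/web_logs/;

-- The new file is read on the next query; no LOAD DATA step is needed.
SELECT COUNT(*) FROM web_logs;
```

Because the table is external, dropping it removes only the metadata and leaves the files in place.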
Upvotes: 3
Reputation: 4285
A) The following command can be run multiple times to load multiple files:
LOAD DATA INPATH 'file_1/path/hdfs' INTO TABLE tablename;
LOAD DATA INPATH 'file_2/path/hdfs' INTO TABLE tablename;
.......
or
LOAD DATA LOCAL INPATH 'file_1/path/local' INTO TABLE tablename;
LOAD DATA LOCAL INPATH 'file_2/path/local' INTO TABLE tablename;
....
The INTO keyword appends the data, file after file. Be careful not to use OVERWRITE by mistake, because it replaces the existing contents of the table.
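To make the difference concrete, here is a short sketch contrasting the two forms. The file paths are hypothetical placeholders.

```sql
-- Appends: rows from day2.log are added to whatever tablename already holds.
LOAD DATA INPATH '/hypothetical/logs/day2.log' INTO TABLE tablename;

-- Replaces: existing data in tablename is removed first,
-- so only the rows from day3.log remain afterwards.
LOAD DATA INPATH '/hypothetical/logs/day3.log' OVERWRITE INTO TABLE tablename;
```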
B) When the files are in one directory:
LOAD DATA INPATH 'dir/path/hdfs' INTO TABLE tablename;
or,
LOAD DATA LOCAL INPATH 'dir/path/local' INTO TABLE tablename;
Important: when the directory contains a non-data file (most likely in HDFS), the above command throws an error. For example, suppose Pig (or another tool) has generated a directory called my_data_dir. Under my_data_dir there are two data files, /my_data_dir/part-m-00000 and /my_data_dir/part-m-00001, and there is also a log file named /my_data_dir/_logs.
In this case, running the above command gives an error that mentions the log file. Delete the log file and the command works fine.
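As a rough sketch, the cleanup and the load can both be run from the Hive shell, assuming the directory layout from the example above. The `dfs` command passes its arguments to the Hadoop file system shell; on older Hadoop versions the recursive delete may need to be spelled `-rmr` instead of `-rm -r`.

```sql
-- Remove the non-data entry left behind by the job (path from the example above).
dfs -rm -r /my_data_dir/_logs;

-- Check that only the part-m-* data files remain.
dfs -ls /my_data_dir;

-- Load the whole directory into the table.
LOAD DATA INPATH '/my_data_dir' INTO TABLE tablename;
```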
Upvotes: 1
Reputation: 4094
Just use the standard Hive syntax:
LOAD DATA INPATH 'filepath' INTO TABLE tablename
Here filepath can be a relative path, an absolute path, or a full URI with scheme and authority, for example:
project/data1
/user/hive/project/data1
hdfs://namenode:9000/user/hive/project/data1
filepath can be a directory, and all the files in that directory will be moved into the table.
Source: Hive Language Manual
Upvotes: 13