user3305569

Reputation: 123

How to load multiple files into a Hive table?

I have many 10 MB log files that I need to load into Hive. Later I will need to add a few more log files to the existing table. Can anyone help me with this?

Upvotes: 1

Views: 26522

Answers (3)

Amar

Reputation: 3845

Why not create an external table in Hive with a specified location and copy your files into that location? The external table will automatically pick up any new files placed in that folder, provided they have the same schema.
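
A minimal sketch of that approach, assuming a hypothetical table name (app_logs_ext), a single-column schema, and a made-up HDFS directory. None of these come from the question, so adjust them to the real log format:

```sql
-- Hypothetical names: app_logs_ext and /user/hive/external/app_logs are
-- placeholders chosen for illustration.
CREATE EXTERNAL TABLE IF NOT EXISTS app_logs_ext (
  line STRING   -- one raw log line per row; replace with real columns
)
STORED AS TEXTFILE
LOCATION '/user/hive/external/app_logs';

-- Files copied into that directory later need no further Hive command;
-- they are read the next time the table is queried.
SELECT COUNT(*) FROM app_logs_ext;
```

Because the table is external, dropping it removes only the metadata and leaves the log files in place.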

Upvotes: 3

Dexter

Reputation: 4285

A) The following command can be run multiple times to load multiple files:

LOAD DATA INPATH 'file_1/path/hdfs' INTO TABLE tablename;
LOAD DATA INPATH 'file_2/path/hdfs' INTO TABLE tablename;
.......

or

LOAD DATA LOCAL INPATH 'file_1/path/local' INTO TABLE tablename;
LOAD DATA LOCAL INPATH 'file_2/path/local' INTO TABLE tablename;

....

The INTO keyword appends each data file to the table's existing data. Take care not to use OVERWRITE by mistake, since it replaces what is already in the table.
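
To make the difference concrete, here is a short sketch. The table name app_logs and the staging paths are hypothetical:

```sql
-- Hypothetical table and paths, for illustration only.

-- Appends: rows already in app_logs are kept.
LOAD DATA INPATH '/user/hive/staging/log_2.txt' INTO TABLE app_logs;

-- Replaces: the existing contents of app_logs are deleted first,
-- then the new file becomes the table's only data.
LOAD DATA INPATH '/user/hive/staging/log_3.txt' OVERWRITE INTO TABLE app_logs;
```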

B) When the files are in one directory:

LOAD DATA INPATH 'dir/path/hdfs' INTO TABLE tablename;
or,
LOAD DATA LOCAL INPATH 'dir/path/local' INTO TABLE tablename;

Important: when the directory contains a non-data file (common in HDFS), the above command throws an error. For example, suppose Pig (or another tool) has generated a directory called my_data_dir containing two data files, /my_data_dir/part-m-00000 and /my_data_dir/part-m-00001, and a log file named /my_data_dir/_logs.

In this case, running the above command fails with an error that mentions the log file. Delete the log file and the command works.
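
If deleting the log file is not an option, one alternative (a sketch only, reusing the paths from the example above) is to skip the directory-level load and name each data file explicitly:

```sql
-- Load only the data files, leaving /my_data_dir/_logs untouched.
-- tablename is the same placeholder used in the commands above.
LOAD DATA INPATH '/my_data_dir/part-m-00000' INTO TABLE tablename;
LOAD DATA INPATH '/my_data_dir/part-m-00001' INTO TABLE tablename;
```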

Upvotes: 1

Santiago Cepas

Reputation: 4094

Just use the standard Hive syntax:

LOAD DATA INPATH 'filepath' INTO TABLE tablename

Here filepath can refer to:

  • a relative path, such as project/data1
  • an absolute path, such as /user/hive/project/data1
  • a full URI with scheme and (optionally) an authority, such as hdfs://namenode:9000/user/hive/project/data1

filepath can be a directory, and all the files in that directory will be moved into the table.

Source: Hive Language Manual
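
As an illustration, the three path forms listed above would look like this in a statement. The paths are the ones from the list, tablename is a placeholder, and the relative form is assumed to resolve against the current user's HDFS home directory. Run only one of them, since the files are moved out of the source location once a load succeeds:

```sql
-- Relative path
LOAD DATA INPATH 'project/data1' INTO TABLE tablename;

-- Absolute path
LOAD DATA INPATH '/user/hive/project/data1' INTO TABLE tablename;

-- Full URI with scheme and authority
LOAD DATA INPATH 'hdfs://namenode:9000/user/hive/project/data1' INTO TABLE tablename;
```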

Upvotes: 13
