Peaches

Reputation: 48

Alternative ways to load files from HDFS into Hive when they are not in a directory

ROW FORMAT DELIMITED FIELDS TERMINATED BY '${database_delimiter}'
LINES TERMINATED BY '\n' STORED AS TEXTFILE
LOCATION '${database_location}/Person';

Here Person is expected to be a directory, but it is actually a part-m file, not a directory.
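For context, the snippet above is presumably the tail of a CREATE TABLE statement; a minimal complete sketch might look like the following (the table name and columns are invented for illustration — only the format, delimiter, and location clauses come from the question):

```sql
-- Hypothetical complete statement; the columns are illustrative,
-- the ROW FORMAT / LOCATION clauses are from the question.
CREATE EXTERNAL TABLE person (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '${database_delimiter}'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '${database_location}/Person';
```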

Upvotes: 0

Views: 186

Answers (1)

OneCricketeer

Reputation: 191973

If I understand the question correctly, Hive will indeed fail to create a table over a file. It needs to be a directory location.

Therefore, whatever process you have needs to make said directory.

For example, whatever mapper process you ran needed an output directory specified; if it was not, the files were written alongside other files in some existing location. (MapReduce should normally fail with an error that the destination directory already exists, though.)

What you could do is move all the part files into a new directory:

$ hdfs dfs -mkdir -p ${database_location}/Person/
$ # create hive table using that location
$ hdfs dfs -mv  ${database_location}/part-m* ${database_location}/Person/
$ # run hive query

Or, if you have raw files, you can do something similar:

$ hdfs dfs -mkdir -p ${database_location}/Person/
$ # create hive table using that location
$ hdfs dfs -put somefile ${database_location}/Person/
$ # run hive query

Or use LOAD DATA INPATH to move files from one HDFS location into a Hive table (LOAD DATA LOCAL INPATH is the variant that reads from the local filesystem instead).
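A sketch of that approach, assuming the table already exists and pointing at a hypothetical source path based on the question's layout (note that for an HDFS path, LOAD DATA INPATH moves the file rather than copying it):

```sql
-- The source path and table name here are assumptions for illustration.
LOAD DATA INPATH '${database_location}/part-m-00000'
INTO TABLE person;
```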

Upvotes: 1
