venkat

Reputation: 41

How to load data into Hive from HDFS without removing the source file

I have multiple files in a single HDFS folder. I want to load each file into a different Hive table, and I want to keep the source files in the same location.

I know we can create an external table pointing to the directory.
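For example (table name, schema, and path are all made up):

CREATE EXTERNAL TABLE mydb.mytable (a STRING, b STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/venkat/input_dir';

But a table like this reads every file in the folder, and I want one table per file.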

Is it possible to create an external table pointing to a particular file?

Can anyone please help me resolve this issue?

Upvotes: 2

Views: 999

Answers (2)

leftjoin

Reputation: 38335

An external table always has its own location (a folder). Copy the file to the table location using the hadoop distcp <srcurl> <desturl> command or hdfs dfs -cp .... See https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#cp for reference.
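As a minimal sketch (the schema, delimiter, and paths here are assumptions): create one external table per file, each with its own folder, then copy each source file into the matching folder. Unlike a move, -cp leaves the original file in place, so the source folder stays intact.

# create an external table with its own folder (schema and delimiter assumed)
hive -e "CREATE EXTERNAL TABLE mydb.table1 (a STRING, b STRING, c STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         LOCATION '/warehouse/mydb/table1'"

# copy (not move) the source file into the table location; the original stays put
hdfs dfs -cp /user/venkat/input_dir/file1.csv /warehouse/mydb/table1/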

Upvotes: 0

Samson Scharfrichter

Reputation: 9067

If you have a predefined number of files with predefined names, you might try a multi-table INSERT, with WHERE clauses based on the INPUT__FILE__NAME virtual column.

-- one pass over the source table, routed to multiple target tables
FROM some_db.some_external_table
INSERT INTO TABLE table1
  SELECT a, b, c
  WHERE INPUT__FILE__NAME LIKE '%/gabuzomeu.csv'   -- only rows read from this file
INSERT INTO TABLE table2
  SELECT a, x, d, CAST(z AS FLOAT)
  WHERE INPUT__FILE__NAME LIKE '%/wtf.csv'
...
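To see which paths the virtual column actually reports before writing the WHERE clauses, you can inspect it directly (same hypothetical table name as above):

SELECT DISTINCT INPUT__FILE__NAME
FROM some_db.some_external_table;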

Upvotes: 1
