Reputation: 7
I have data input like this:
Structure of csv file is:
x,y
0.0,0.0
18.6,-11.1
36.1,-21.9
53.7,-32.6
70.1,-42.8
86.5,-52.6
I want to load all file in this folder to Hive table like:
id, x, y, file_name, folder_name
1, 0.0, 0.0, 1.csv, driver_1
...
How can I do it? Can anyone help me please?
Upvotes: 0
Views: 496
Reputation: 2725
Hive has a virtual column named INPUT__FILE__NAME
that contains the full path to the input file that contained the record. Then using REGEXP_EXTRACT
we can extract out the parent directory and the filename:
SELECT
x
, y
, REGEXP_EXTRACT(INPUT__FILE__NAME, '.*/(.*)/(.*)', 2) AS file_name
, REGEXP_EXTRACT(INPUT__FILE__NAME, '.*/(.*)/(.*)', 1) AS folder_name
FROM
table
;
Upvotes: 2