Dumb Guy

Reputation: 23

Hive: query files within a specific date range

Suppose I have files on HDFS with the following names: data1-2018-01-01.txt, data1-2018-01-02.txt, data1-2018-01-03.txt, data1-2018-01-04.txt, data1-2018-01-06.txt

Now I want to query files based on date:

select * from mytable where date > '2018-01-03' and date < '2018-01-06';

My question: is it possible to create an external table over only the files that satisfy my query? Or is there a workaround?

I know I could use partitions, but they require adding the data manually whenever a new data set arrives.

Upvotes: 1

Views: 148

Answers (1)

leftjoin

Reputation: 38290

Put those files into a directory and create a new table on top of it. Hive also has the INPUT__FILE__NAME virtual column, which you can use for filtering:

where INPUT__FILE__NAME like '%2018-01-03%'

It is also possible to use substr or regexp_extract to get the date from the file name, then use IN or >, < to filter on it.
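The regexp_extract approach could look roughly like the sketch below. It assumes the table is called mytable (as in the question) and that every file name ends in a yyyy-MM-dd date followed by .txt; the pattern is illustrative and would need adjusting for other naming schemes.

```sql
-- Pull the yyyy-MM-dd part out of the file path and compare it as a string.
-- ISO-formatted dates sort lexicographically, so plain string comparison works.
SELECT *
FROM mytable
WHERE regexp_extract(INPUT__FILE__NAME, '([0-9]{4}-[0-9]{2}-[0-9]{2})[.]txt$', 1) > '2018-01-03'
  AND regexp_extract(INPUT__FILE__NAME, '([0-9]{4}-[0-9]{2}-[0-9]{2})[.]txt$', 1) < '2018-01-06';
```

With the files listed in the question, only rows from data1-2018-01-04.txt fall strictly between the two dates. Character classes such as [0-9] and [.] are used here instead of \d and \. to sidestep backslash escaping in Hive string literals.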

Upvotes: 1
