Reputation: 2531
This is my maiden voyage into Hive. I have multiple Hive snapshot tables, named as follows:
revenue_20110131
revenue_20110228
revenue_20110331
purchases_qrt1
purchases_qrt2
purchases_qrt3
purchases_qrt4
I have a lot of such snapshot tables. Now I need to build a script that takes part of a table name as a parameter, reads the records from all similarly named tables, and exports the data from all of them into a single ORC file.
How can I do this in Hive? I have no idea where to start, as I've never worked with Hive before. Can someone please help me? Thanks in advance, guys.
Upvotes: 0
Views: 122
Reputation: 38290
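If the table locations share a common parent directory, and the tables have the same columns and file format, you can create a new table on top of that parent directory and read all of them in a single select. Make it an external table so that dropping it later does not delete the snapshot files.
create external table new_tbl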
...
location 'upper common directory path here'
Then add these settings before the select:
set hive.mapred.supports.subdirectories=TRUE;
set mapred.input.dir.recursive=TRUE;
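With those settings in place, the select and the ORC export could look roughly like the sketch below. It is only a sketch under assumptions: it assumes the table over the common directory is called new_tbl, and the output table name, the script file name export_snapshots.hql and the prefix variable are hypothetical names chosen for illustration. It also assumes every file under the common directory has the same layout, because the filter on the INPUT__FILE__NAME virtual column only discards rows after they have been read.

```sql
-- Sketch only: new_tbl, the *_all output table and the prefix variable
-- are hypothetical names, not something Hive provides.
-- Possible invocation: hive --hivevar prefix=revenue -f export_snapshots.hql

set hive.mapred.supports.subdirectories=TRUE;
set mapred.input.dir.recursive=TRUE;

-- Copy the matching rows into a new ORC-backed table.
-- INPUT__FILE__NAME is a Hive virtual column holding the path of the
-- file each row came from; it is used here to keep only rows whose
-- path contains the table-name prefix passed in as a parameter.
create table ${hivevar:prefix}_all
stored as orc
as
select *
from new_tbl
where INPUT__FILE__NAME like '%/${hivevar:prefix}_%';
```

Note that Hive may still write several ORC files for the new table, depending on how many tasks the query runs with, so getting exactly one file may need an extra merge step.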
Upvotes: 1