Reputation: 2531
It's pretty straightforward what I'm trying to do. I just need to count the records in multiple Hive tables.
I want to create a very simple HQL script that takes a file.txt with table names as input and counts the total number of records in each of them:
SELECT COUNT(*) from <tablename>
Output should be like:
table1 count1
table2 count2
table3 count3
I'm new to Hive and not very well versed in Unix scripting, and I'm unable to figure out how to create a script to perform this.
Can someone please help me in doing this? Thanks in advance.
Upvotes: 2
Views: 1921
Reputation: 38325
Simple working shell script:
db=mydb
for table in $(hive -S -e "use $db; show tables;")
do
#echo "$table"
hive -S -e "use $db; select '$table' as table_name, count(*) as cnt from $table;"
done
You can improve this script by generating a file of select commands, or even a single select with union all, and then executing that file instead of calling Hive once per table.
If you want to read table names from a file, use this:
for table in $(cat filename)
do
...
done
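As a rough sketch of the single-select idea, the function below reads table names from a file (one per line) and prints one query joined with union all. The function name build_count_query and the file names file.txt and counts.hql are made up for illustration, and it assumes your Hive version accepts union all at the top level of a query; on older versions you may need to wrap the whole thing in a subquery.

```shell
#!/bin/sh
# Hypothetical helper: print one "union all" query that counts the rows
# of every table listed in the file given as $1 (one table name per line).
build_count_query() {
    first=1
    while IFS= read -r table || [ -n "$table" ]
    do
        # Skip blank lines in the input file.
        [ -z "$table" ] && continue
        # Put "union all" between selects, but not before the first one.
        [ "$first" -eq 1 ] || printf 'union all\n'
        printf "select '%s' as table_name, count(*) as cnt from %s\n" "$table" "$table"
        first=0
    done < "$1"
}

# Usage (assumes file.txt exists and the tables live in database mydb):
#   build_count_query file.txt > counts.hql
#   hive -S --database mydb -f counts.hql
```

This way Hive is started once instead of once per table, which is usually where most of the waiting time goes.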
Upvotes: 4