Reputation: 21
I am a newbie to Hadoop and Hive. My current requirement is to collect stats on the number of records loaded into 15 tables on each run day. Rather than executing a select count(*)
query for each table and copying the output manually to Excel, I would like to automate this. Could anyone suggest the best method, please?
Note: we do not have any GUI for running Hive queries; we submit them from a normal Unix terminal.
Upvotes: 2
Views: 59
Reputation: 38290
Export the results to a CSV or TSV file, then open the file in Excel. Normally Hive produces tab-separated (TSV) output. This is how to convert it to comma-separated if you prefer CSV:
hive -e "SELECT 'table1' as source, count(*) cnt FROM db.table1
UNION ALL
SELECT 'table2' as source, count(*) cnt FROM db.table2" | tr "\t" "," > mydata.csv
Add the remaining tables to the query in the same way. To read the output file from Windows, you can mount the directory it is written to using Samba or NFS. Schedule the command with crontab and voilà, every day you have an updated file.
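With 15 tables, the query is easier to maintain if a small script builds it from a list. Here is a minimal sketch of that idea, not a tested production script: the table names in TABLES, the function name build_count_query, and the date-stamped file name are all placeholders I made up, so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical table list; replace with your 15 real table names.
TABLES="db.table1 db.table2 db.table3"

# Build one "SELECT ... count(*)" per table and join them with UNION ALL.
# Prints the finished query on stdout.
build_count_query() {
    query=""
    for t in "$@"; do
        part="SELECT '$t' AS source, count(*) AS cnt FROM $t"
        if [ -z "$query" ]; then
            query="$part"
        else
            query="$query UNION ALL $part"
        fi
    done
    printf '%s\n' "$query"
}

# Run the query only when the hive CLI is available on PATH.
# Output file name is an assumption: one CSV per run day.
if command -v hive >/dev/null 2>&1; then
    hive -e "$(build_count_query $TABLES)" | tr '\t' ',' > "counts_$(date +%Y-%m-%d).csv"
else
    echo "hive CLI not found, nothing exported" >&2
fi
```

Saved as, say, /path/to/table_counts.sh (a made-up path), it could be scheduled with a crontab entry such as `0 6 * * * /path/to/table_counts.sh` to run every day at 06:00. Since cron starts jobs in your home directory with a minimal PATH, you will probably want an absolute output path and may need the full path to hive.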
You can also connect Excel directly to Hive using ODBC drivers:
https://mapr.com/blog/connecting-apache-hive-to-odbc/
Error connecting Hortonworks Hive ODBC in Excel 2013
Upvotes: 3