Connection to HIVE using bash

Question

I made a simple bash skript, which is supposed to count columns in every table in a specified database. The problem is, that the part

`hive -S -e 'use '$database'; describe '$word'' | grep -e '^[a-z]' | uniq |wc -l >>MY_PROJECTS/ColumnCountList.txt`

consumes a lot of time because of hive connection - everytime it counts columns in every next table, it connects to hive again.

Is there any way to connect to Hive once, and then just execute commands ? It would be much faster especially when im doing it for db with 1k tables. Code below:

#!/bin/bash
database=pos
tables_list=`hive -S -e 'use '$database'; show tables' | grep -e '^[a-z]' | uniq`
for word in $tables_list
do
   `echo -n "$word : ">>MY_PROJECTS/ColumnCountList.txt`
   `hive -S -e 'use '$database'; describe '$word'' | grep -e '^[a-z]' | uniq |wc -l >>MY_PROJECTS/ColumnCountList.txt`
done

tale852150 · Accepted Answer

Build a file ( here called desc_tbl_list.sql ) using your bash script that contains the describe command and table name, like so:

describe tablename1
describe tablename2
...

Then run the script using hive command containing all the describe commands like so:

hive -f desc_tbl_list.sql

HTH

Connection to HIVE using bash

Answers (2)

Related Questions