v83rahul
v83rahul

Reputation: 313

Search a table in all databases in hive

In Hive, how do we search a table by name in all databases?

I am a Teradata user. Is there any counterpart of systems tables (present in Teradata) like dbc.tables, dbc.columns which are present in HIVE?

Upvotes: 11

Views: 51275

Answers (7)

Kate V.
Kate V.

Reputation: 21

That's an extention of Mantej Singh's answer: you can use pyspark to find tables across all Hive databases (not just one):

from functools import reduce
from pyspark import SparkContext, HiveContext
from pyspark.sql import DataFrame

sc = SparkContext()
sqlContext = HiveContext(sc)

dbnames = [row.databaseName for row in sqlContext.sql('SHOW DATABASES').collect()]

tnames = []
for dbname in dbnames:
    tnames.append(sqlContext.sql('SHOW TABLES IN {} LIKE "%your_pattern%"'.format(dbname)))

tables = reduce(DataFrame.union, tnames)
tables.show()

The way to do this is to iterate through the databases searching for table with a specified name.

Upvotes: 2

ashok viswanathan
ashok viswanathan

Reputation: 117

Searching for tables with name containing infob across all Hive databases

for i in `hive -e "show schemas"`; do echo "Hive DB: $i"; hive -e "use $i; show tables"|grep "infob"; done

Upvotes: 3

Mantej Singh
Mantej Singh

Reputation: 410

You can use SQL like to search a table. Example: I want to search a table with the name starting from "Benchmark" I don't know the rest of it.

Input in HIVE CLI:

show tables like 'ben*'

Output:

+-----------------------+--+
|       tab_name        |
+-----------------------+--+
| benchmark_core_month  |
| benchmark_core_qtr    |
| benchmark_core_year   |
+-----------------------+--+
3 rows selected (0.224 seconds)

Or you can try below command if you are using Beeline

!tables

Note: It will work with Beeline only (JDBC client based)

More about beeline: http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/

Upvotes: 12

Oleg Svechkarenko
Oleg Svechkarenko

Reputation: 2516

@hisi's answer is elegant. However it induce an error with lacking memory for GC on our cluster. So, there is another less elegant approach that works for me.

Let foo is the table name to search. So

hadoop fs -ls -R -C /apps/hive/warehouse/ 2>/dev/null | grep '/apps/hive/warehouse/[^/]\{1,\}/foo$'

If one does not remember exact name of table but only substring bar in table name, then command is

hadoop fs -ls -R -C /apps/hive/warehouse/ 2>/dev/null | grep '/apps/hive/warehouse/[^/]\{1,\}/[^/]\{1,\}$' | grep bar

Upvotes: 1

HISI
HISI

Reputation: 4797

you can also use hdfs to find a table in all databases:

the path of hive databases is:

/apps/hive/warehouse/

so, by using hdfs :

hdfs dfs -find /apps/hive/warehouse/ -name t*

Upvotes: 5

Avi C
Avi C

Reputation: 56

Hive stores all its metadata information in Metastore. Metastore schema can be found at: link: https://issues.apache.org/jira/secure/attachment/12471108/HiveMetaStore.pdf

It has tables like DBS for database, TBLS for tables and Columns. You may use appropriate join, to find out table name or column names.

Upvotes: 2

David דודו Markovitz
David דודו Markovitz

Reputation: 44991

You should query the metastore.

You can find the connection properties within hive-site.xml

bash

<$HIVE_HOME/conf/hive-site.xml grep -A1 jdo

    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1/metastore?createDatabaseIfNotExist=true</value>
--
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
--
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
--
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>cloudera</value>

Within the metastore you can use a query similar to the following

mysql

select          *

from                            metastore.DBS             as d

                join            metastore.TBLS            as t 

                on              t.DB_ID =
                                d.DB_ID  

where           t.TBL_NAME  like '% ... put somthing here ... %'

order by       d.NAME 
              ,t.TBL_NAME                                          
;

Upvotes: 3

Related Questions