Reputation: 91
I have a Local directory it is used to store the hive table data.
I need to list all tables which are using Local directory .
These tables (managed tables) are stored in hive Default DB , this DB allows to store Data in other Local directories .
My Local directory : /abc/efg/data/
Table data is Stored in sub folders like 123 , 456,789 etc
For table xyz location is /abc/efg/data/123 , PQR location is /abc/efg/data/456 like that.
I am trying to use
hive -e " show tables " > All_tables list all tables and redirect to a file
For each line(each table) in All_tables
hive -e " desc formatted $line " | grep '/abc/efg/data/' >> Tables_My_local_dir
but it will result some performance issue as i have 6000 tables in DB .
please help me to list all tables which are using Local directory with a best performance.
Upvotes: 0
Views: 3442
Reputation: 91
Based on the rule
HADOOP TABLES ARE DIRECTORIES
I have created a shell script to do the below steps.
Step 1. Find all the directories which are not being modified since last 14 days .
Step 2 . Separate real tables and real folders 2.1execute "desc $dir_name "
2.2 based on return status($?) redirect $dir_name to two files(one for real tables and other for directories )
Now I have the required tables in a file.
Upvotes: 0
Reputation: 2415
I assume that you wanted to list table and its corresponding location information by extracting it from the desc formatted command for managed tables in default database.
If my understanding is correct, I suggest you to go with querying the Hive Meta-store, provided its an externally configured one and you have necessary permissions to fetch the same information
Query on meta-store:
SELECT T.TBL_NAME AS TABLE_NAME,S.LOCATION AS LOCATION FROM TBLS T LEFT JOIN SDS S ON T.SD_ID=S.SD_ID WHERE T.TBL_TYPE='MANAGED_TABLE' AND T.DB_ID=1 ;
note: in the query, DB_ID for default database is 1
Output:
------------+------------------------------------------------------------+
| TABLE_NAME | LOCATION |
+------------+------------------------------------------------------------+
| sample | hdfs://********:8020/user/hive/warehouse/sample |
...
.
Upvotes: 1