Reputation: 3208
I wonder if there is a way to get a table's data location from Hive using a one-liner. Something like
select d.location from ( describe formatted table_name partition ( .. ) ) as d;
My current solution is to get the full output and then parse it.
Upvotes: 3
Views: 2365
Reputation: 38335
There are two methods if you do not have access to the metastore database.
Parse the output of DESCRIBE FORMATTED
in the shell, as in this answer: https://stackoverflow.com/a/43804621/2700344
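A minimal sketch of that parsing, not taken from the linked answer. It assumes the Hive CLI output format, where the location appears on a line starting with `Location:`, and a path without spaces; `extract_location` is a made-up helper name.

```shell
# Read DESCRIBE FORMATTED output on stdin and print the value of the
# first "Location:" line. Assumes whitespace-separated Hive CLI output.
extract_location() {
  awk '$1 == "Location:" { print $2; exit }'
}

# Usage (assumes the hive CLI is on PATH):
#   hive -S -e "describe formatted table_name" | extract_location
```

The same filter should work for a partition, since `describe formatted table_name partition (..)` prints a `Location:` line as well.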
Hive also has a virtual column, INPUT__FILE__NAME.
select INPUT__FILE__NAME from table
outputs the location URL of each file. You can split the URL by '/', take the element you need, aggregate, etc.
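One way to do the split-and-aggregate step is outside Hive. This is a rough sketch that drops the file name from each URL and keeps the distinct directories; `file_dirs` is a hypothetical helper name.

```shell
# Reduce a list of file URLs (one per line on stdin) to the distinct
# directories that contain them, by stripping the last path component.
file_dirs() {
  sed 's#/[^/]*$##' | LC_ALL=C sort -u
}

# Usage (assumes the hive CLI is on PATH):
#   hive -S -e "select distinct INPUT__FILE__NAME from table_name" | file_dirs
```

For a partitioned table this prints one directory per partition that contains data.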
Upvotes: 0
Reputation: 1484
Unlike a traditional RDBMS, Hive stores its metadata in a separate database, the metastore. In most cases it is MySQL or Postgres. The metastore connection details can be found in hive-site.xml. If you have access to the metastore database, you can run SELECT on table TBLS to get the details about the tables, COLUMNS_V2 to get the details about columns, etc.
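For the location specifically, the storage descriptor table SDS holds it, linked from TBLS through SD_ID. A sketch that builds the query, assuming a MySQL-backed metastore with the standard DBS/TBLS/SDS tables; the helper name, host, user, and schema name in the usage line are made up.

```shell
# Print a metastore query that returns one table's location.
# Assumes a MySQL-backed metastore with the standard DBS/TBLS/SDS tables.
# Names are interpolated as-is, so only pass trusted values.
metastore_location_sql() {
  db="$1"
  tbl="$2"
  cat <<EOF
SELECT s.LOCATION
FROM DBS d
JOIN TBLS t ON t.DB_ID = d.DB_ID
JOIN SDS s ON s.SD_ID = t.SD_ID
WHERE d.NAME = '${db}' AND t.TBL_NAME = '${tbl}';
EOF
}

# Usage (hypothetical host, user and schema name):
#   metastore_location_sql default table_name | mysql -N -h metastore-host -u hive -p metastore
```

Partition locations are stored the same way, with PARTITIONS pointing at SDS through its own SD_ID.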
If you do not have access to the metastore, the only option is to describe each table to get the details. If you have a lot of databases and tables, you could write a shell script that gets the list of tables using "show tables" and loops over them.
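Such a loop could look roughly like this. It assumes the `hive` CLI is on PATH and that the location shows up on a `Location:` line of `describe formatted`; `list_table_locations` is a made-up name.

```shell
# Print "<table><TAB><location>" for every table in one database.
# Starts the hive CLI once per table, so expect it to be slow.
list_table_locations() {
  db="$1"
  hive -S -e "show tables in ${db}" | while read -r tbl; do
    [ -n "$tbl" ] || continue
    # </dev/null keeps the inner call from consuming the table list on stdin
    loc=$(hive -S -e "describe formatted ${db}.${tbl}" </dev/null |
      awk '$1 == "Location:" { print $2; exit }')
    printf '%s\t%s\n' "$tbl" "$loc"
  done
}

# Usage:
#   list_table_locations default
```

To cover every database, wrap the call in a second loop over the output of "show databases".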
Upvotes: 1