Reputation: 172
I have set up a standalone Hive Metastore (v3.0.0) backed by Postgres and created external tables through Spark SQL. The external data is located in Azure Blob Storage. I am able to query these tables using dbname.tablename instead of the actual location. These are non-partitioned tables.
When I check the TBLS table in the metastore Postgres DB, I can see the TBL_TYPE field set to EXTERNAL_TABLE, and there is an SD_ID key which maps to the SDS table. The SDS table has a LOCATION field, but it doesn't show the actual blob location. Instead it shows the database path with a PLACEHOLDER appended to it:
Location
file:/home/ash/mydatabase/youraddress4-__PLACEHOLDER__
The above local location doesn't even exist.
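For anyone wanting to reproduce this, a query along these lines against the metastore returns the same placeholder (the Hive schema quotes its identifiers in Postgres; youraddress4 is the table name from above):

-- Join TBLS to SDS to see the stored LOCATION for one table
SELECT d."NAME" AS db_name,
       t."TBL_NAME",
       t."TBL_TYPE",
       s."LOCATION"
FROM   "TBLS" t
JOIN   "DBS"  d ON d."DB_ID" = t."DB_ID"
JOIN   "SDS"  s ON s."SD_ID" = t."SD_ID"
WHERE  t."TBL_NAME" = 'youraddress4';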
How do Spark and the Hive Metastore resolve the actual location of these tables, and where is it actually stored in the metastore database?
Upvotes: 5
Views: 449
Reputation: 1
Had the same question (using Hive 4.0.0 though). Ended up doing a pg_dump and searching the output file for the expected abfss:// path. It turned up in the SERDE_PARAMS table, so presumably Hive uses that to resolve the actual location.
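If you'd rather query than grep a dump, a join along these lines should surface the same rows. This is a sketch against the standard metastore schema (identifiers are quoted in Postgres); to my understanding, Spark-created datasource tables keep the real location under the PARAM_KEY 'path', but verify the key name in your own metastore:

-- Walk TBLS -> SDS -> SERDE_PARAMS to find the stored blob path
SELECT t."TBL_NAME",
       p."PARAM_KEY",
       p."PARAM_VALUE"
FROM   "TBLS"         t
JOIN   "SDS"          s ON s."SD_ID"    = t."SD_ID"
JOIN   "SERDE_PARAMS" p ON p."SERDE_ID" = s."SERDE_ID"
WHERE  p."PARAM_KEY" = 'path'
   OR  p."PARAM_VALUE" LIKE 'abfss://%';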
Upvotes: 0