bukli

Reputation: 172

Spark SQL external table (Hive support) - find the 'path' location of an external (blob storage) table in the metastore DB

I have set up a standalone Hive metastore (v3.0.0) backed by Postgres and created external tables in Spark SQL. The external data lives in Azure Blob Storage. I am able to query these tables using dbname.tablename instead of the actual location. These are non-partitioned tables.
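For context, the tables were created with Spark SQL DDL along these lines (a sketch; the data source format and the storage path here are illustrative, not my real values):

    -- hypothetical example; the format and the abfss:// path are placeholders
    CREATE TABLE mydatabase.youraddress4
    USING parquet
    LOCATION 'abfss://mycontainer@myaccount.dfs.core.windows.net/youraddress4';

    -- queried by name afterwards, no location needed
    SELECT count(*) FROM mydatabase.youraddress4;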

When I check the TBLS table in the metastore Postgres DB, I can see the TBL_TYPE field set to EXTERNAL_TABLE, and an SD_ID key that maps to the SDS table. The SDS table has a LOCATION field, but it doesn't show the actual blob location. Instead it shows the database path with a PLACEHOLDER appended to it.

Location
file:/home/ash/mydatabase/youraddress4-__PLACEHOLDER__ 

The above local location doesn't even exist.
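A query like the following reproduces the check above (assuming the standard Hive 3.x metastore schema, where the Postgres tables and columns are quoted uppercase identifiers):

    SELECT t."TBL_NAME", t."TBL_TYPE", s."LOCATION"
    FROM "TBLS" t
    JOIN "SDS" s ON s."SD_ID" = t."SD_ID"
    WHERE t."TBL_NAME" = 'youraddress4';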

How does Spark or the Hive metastore resolve the actual location of these tables, and where is it actually stored in the metastore database?

Upvotes: 5

Views: 449

Answers (1)

Blue Yankee

Reputation: 1

I had the same question (using Hive 4.0.0, though). I ended up doing a pg_dump and searching the output file for the expected abfss:// path. It turned up in the SERDE_PARAMS table, so presumably Hive uses that to resolve the actual location.
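Instead of grepping a full dump, a query along these lines should surface the same rows directly (a sketch against the standard metastore schema, where SDS references SERDE_PARAMS via SERDE_ID, and the Postgres identifiers are quoted uppercase):

    -- Spark appears to store the real location as a serde parameter,
    -- typically under PARAM_KEY = 'path'
    SELECT t."TBL_NAME", p."PARAM_KEY", p."PARAM_VALUE"
    FROM "TBLS" t
    JOIN "SDS" s ON s."SD_ID" = t."SD_ID"
    JOIN "SERDE_PARAMS" p ON p."SERDE_ID" = s."SERDE_ID"
    WHERE p."PARAM_VALUE" LIKE 'abfss://%';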

Upvotes: 0
