Shankar

Reputation: 27

How to access Delta tables in one Databricks workspace from another Databricks workspace

I want to access Delta tables in one Databricks environment from another Databricks environment by creating a global Hive metastore in one of them. Is this possible?

Thanks in advance.

Upvotes: 2

Views: 8143

Answers (1)

Alex Ott

Reputation: 87174

There are two aspects here:

  1. The data itself: it must be available to the other workspaces. This is done by writing the data into a shared storage account/container. You can either mount that storage account or use direct access (via a service principal or AAD passthrough). Don't write the data to the built-in DBFS Root, which isn't available to other workspaces. After you write the data using dataframe.write.format("delta").save("some_path_on_adls"), you can read it from another workspace that has access to that shared storage, either:
  • via Spark API: spark.read.format("delta").load("some_path_on_adls")
  • via SQL, using the following syntax in place of a table name (see docs):
delta.`some_path_on_adls`
  2. The metadata: if you want to expose the saved data as SQL tables with database and table names instead of paths, you have the following options:
  • Use the built-in metastore to save the data into a location on ADLS, and then create a so-called external table in the other workspace, inside its own metastore. In the source workspace, run:
dataframe.write.format("delta").option("path", "some_path_on_adls")\
  .saveAsTable("db_name.table_name")

and in the other workspace, execute the following SQL (either via %sql in a notebook or via the spark.sql function):

CREATE TABLE db_name.table_name USING DELTA LOCATION 'some_path_on_adls'
  • Use an external metastore that is shared by multiple workspaces. In this case you just need to save the data correctly:
dataframe.write.format("delta").option("path", "some_path_on_adls")\
  .saveAsTable("db_name.table_name")

You still need to save it into a shared location so that the data is accessible from the other workspace, but you don't need to register the table explicitly, because the other workspace reads the metadata from the same shared metastore.
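
To tie the steps above together, here is a minimal sketch of the first option (built-in metastores plus an external table). It assumes the shared location is an ADLS Gen2 container addressed with an abfss:// URI; the container name `data`, the storage account `sharedacct`, and all helper function names are hypothetical, and `spark` is the session that Databricks notebooks provide. Treat it as an illustration under those assumptions, not as the only way to do it.

```python
def abfss_path(container: str, account: str, relative_path: str) -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.strip('/')}"
    )


def register_table_sql(table_name: str, path: str) -> str:
    """SQL that registers existing Delta data at `path` as an external table."""
    return f"CREATE TABLE IF NOT EXISTS {table_name} USING DELTA LOCATION '{path}'"


def path_query_sql(path: str) -> str:
    """SQL that reads Delta data directly by path, without any metastore entry."""
    return f"SELECT * FROM delta.`{path}`"


def write_in_source_workspace(dataframe, table_name: str, path: str) -> None:
    """Source workspace: write to shared storage and register the table locally."""
    dataframe.write.format("delta").option("path", path).saveAsTable(table_name)


def read_in_other_workspace(spark, table_name: str, path: str):
    """Other workspace: register the external table in its own metastore, then read it."""
    db_name = table_name.split(".")[0]
    # The database must exist in this workspace's metastore before the table is created.
    spark.sql(f"CREATE DATABASE IF NOT EXISTS {db_name}")
    spark.sql(register_table_sql(table_name, path))
    return spark.table(table_name)


# Hypothetical usage in a Databricks notebook, where `spark` is predefined:
#   path = abfss_path("data", "sharedacct", "delta/events")
#   write_in_source_workspace(dataframe, "db_name.table_name", path)  # workspace A
#   df = read_in_other_workspace(spark, "db_name.table_name", path)   # workspace B
#   df_by_path = spark.sql(path_query_sql(path))                      # no metastore needed
```

With a shared external metastore, the `read_in_other_workspace` step reduces to `spark.table("db_name.table_name")`, since the registration made in the source workspace is already visible. In both cases the cluster in the other workspace still needs credentials for the storage account.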

Upvotes: 2
