ymopur

Reputation: 11

Access Azure Key Vault in Pandas to read/write Azure Data Lake Storage Gen2 data in a serverless Apache Spark pool in Synapse Analytics

Recently, Microsoft released a way for Pandas to read/write Azure Data Lake Storage Gen2 data in a serverless Apache Spark pool in Synapse Analytics, as described in this tutorial: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/tutorial-use-pandas-spark-pool

If I want to use the same strategy with PySpark in Azure Databricks, how can I use the data lake secret (from Azure Key Vault) containing the account key so that pandas can access the data lake smoothly? That way, I don't have to expose the secret value in the Databricks notebook.

Upvotes: 1

Views: 388

Answers (1)

Alex Ott

Reputation: 87069

For Azure Databricks you just need to create a secret scope backed by your Azure Key Vault, and then you can use the dbutils.secrets.get function to retrieve a secret from that scope, or ingest the secrets into the Spark conf.
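As a sketch of that flow (the scope name `kv-scope`, the secret name `datalake-key`, and the storage account name are placeholders; `dbutils` and `spark` exist only inside a Databricks notebook, so the Databricks-specific calls are shown in comments):

```python
# Build the Spark conf entry for ADLS Gen2 account-key authentication.
# The helper itself is plain Python; the notebook-only calls are commented.

def adls_account_key_conf(storage_account: str, account_key: str) -> tuple:
    """Return the (key, value) pair to pass to spark.conf.set for ADLS Gen2."""
    return (
        f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
        account_key,
    )

# In a Databricks notebook (scope/secret names are placeholders):
#   account_key = dbutils.secrets.get(scope="kv-scope", key="datalake-key")
#   spark.conf.set(*adls_account_key_conf("mystorageaccount", account_key))
```

Because the secret is fetched with dbutils.secrets.get, its value is redacted in notebook output and never has to appear in the code.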

Please note that you will need to set the correct Spark configuration to use that storage account key; refer to the documentation for details (blob storage, ADLS Gen2).
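With that configuration in place, pandas can read from the lake much like in the Synapse tutorial, passing the secret through `storage_options` instead of hard-coding it. This sketch assumes the `adlfs` package is available on the cluster (pandas uses it as the fsspec backend for `abfss://` paths); container, account, and path names are placeholders:

```python
# Assemble the abfss:// URL and storage_options dict that pandas needs for
# ADLS Gen2 account-key auth. The actual read is shown in a comment since it
# requires a live cluster; all names below are placeholders.

def abfss_read_args(container: str, storage_account: str, path: str,
                    account_key: str) -> tuple:
    """Return (url, storage_options) for pandas read/write on ADLS Gen2."""
    url = f"abfss://{container}@{storage_account}.dfs.core.windows.net/{path}"
    return url, {"account_key": account_key}

# In a Databricks notebook:
#   import pandas as pd
#   account_key = dbutils.secrets.get(scope="kv-scope", key="datalake-key")
#   url, opts = abfss_read_args("raw", "mystorageaccount",
#                               "data/input.csv", account_key)
#   df = pd.read_csv(url, storage_options=opts)
```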

Upvotes: 0
