MystCabe

Reputation: 57

Databricks - Connect to Azure Data Lake Storage Gen2 and Blob Storage - SAS Token on a Directory Level

I have been following this guide: Connect to Azure Data Lake Storage Gen2 and Blob Storage - SAS Tokens

spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "SAS") 

spark.conf.set("fs.azure.sas.token.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider") 

spark.conf.set("fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net", dbutils.secrets.get(scope="<scope>", key="<sas-token-key>"))

It is then possible to read the files

spark.read.load("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-data>")

and list the files

dbutils.fs.ls("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-data>")

This works fine when I create the SAS token at the container level. However, I would like to create the SAS token at the directory level instead.

Listing the files works as before, but I can no longer read them; the read fails with an error (shown in a screenshot, not reproduced here).

Are there any solutions to this?

Upvotes: 0

Views: 1166

Answers (2)

Nico Jordaan

Reputation: 69

You need a user delegation SAS. See the ABFS documentation: https://hadoop.apache.org/docs/stable/hadoop-azure/abfs.html#Shared_Access_Signature_.28SAS.29_Token_Provider

There are three types of SAS supported by Azure Storage:

  • User Delegation SAS: recommended for the ABFS driver with HNS-enabled ADLS Gen2 accounts. It is an identity-based SAS that works at the blob/directory level.
  • Service SAS: global; works at the container level.
  • Account SAS: global; works at the account level.
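As a sketch of how a directory-scoped user delegation SAS could be generated with the Azure SDK for Python (the account, container, and directory names are placeholders, and this assumes the `azure-identity` and `azure-storage-file-datalake` packages plus an Azure AD identity with the right storage roles):

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import (
    DataLakeServiceClient,
    DirectorySasPermissions,
    generate_directory_sas,
)

account = "<storage-account>"

# Authenticate with an Azure AD identity; this identity is what makes
# the resulting SAS a "user delegation" SAS rather than a key-signed one.
service = DataLakeServiceClient(
    f"https://{account}.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)

# Request a user delegation key covering the SAS validity window.
start = datetime.now(timezone.utc)
expiry = start + timedelta(hours=1)
delegation_key = service.get_user_delegation_key(start, expiry)

# Directory-scoped SAS with Read + List, signed by the delegation key.
sas_token = generate_directory_sas(
    account_name=account,
    file_system_name="<container-name>",
    directory_name="<path-to-data>",
    credential=delegation_key,
    permission=DirectorySasPermissions(read=True, list=True),
    expiry=expiry,
)
```

The resulting `sas_token` string can then be supplied to Spark exactly as in the question, via `fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net`. Note this requires live Azure credentials to run.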

Upvotes: 0

JayashankarGS

Reputation: 8140

You need to grant the necessary permissions for that.

Below are the permissions available when generating a SAS token at the directory level. By default, only Read is granted.

(screenshot: permission options when generating a directory-level SAS token)

You need to grant Read and List permissions.
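As a quick sanity check that a token actually carries both permissions, the granted permission letters appear in the token's `sp` query parameter (e.g. `sp=rl` for Read + List). This is a small stdlib-only sketch; the token string below is a made-up illustration:

```python
from urllib.parse import parse_qs


def sas_permissions(sas_token: str) -> set:
    """Return the permission letters granted by a SAS token.

    The `sp` parameter holds one letter per granted permission,
    e.g. `sp=rl` means Read ('r') and List ('l').
    """
    params = parse_qs(sas_token.lstrip("?"))
    return set(params.get("sp", [""])[0])


# Hypothetical directory-level token with Read + List granted.
token = "sv=2022-11-02&sp=rl&sig=REDACTED"
print(sas_permissions(token) >= {"r", "l"})  # True when both are granted
```

If the check prints `False`, regenerate the token with both Read and List selected.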

dbutils.fs.ls("abfss://<container>@<storage_acc>.dfs.core.windows.net/sample/")

Output:

[FileInfo(path='abfss://<container>@<storage_acc>.dfs.core.windows.net/sample/source1_data_preview (1).csv', name='source1_data_preview (1).csv', size=148, modificationTime=1702964218000)]

spark.read.csv("abfss://<container>@<storage_acc>.dfs.core.windows.net/sample/").head(2)

Output:

[Row(_c0='ID', _c1='Start Date (BST)', _c2='Value'), 
Row(_c0='3', _c1='2022-09-20T09:10:22.987654', _c2='200')]

When creating the SAS token, make sure access is also granted to the sub-folders and sub-files of the directory.

Upvotes: 1
