Axel R.

Reputation: 1300

Can't access mounted volume with Python on Databricks

I am trying to give a team access to an Azure Storage Account Gen2 container in their Databricks workspace by mounting it to the DBFS, using credential passthrough. I want to manage access with Active Directory, since eventually some containers will be mounted read-only.

I based my code on this tutorial: https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/azure/adls-passthrough#adls-aad-credentials

An extract from my cluster conf:

"spark_conf": {
        "spark.databricks.cluster.profile": "serverless",
        "spark.databricks.passthrough.enabled": "true",
        "spark.databricks.delta.preview.enabled": "true",
        "spark.databricks.pyspark.enableProcessIsolation": "true",
        "spark.databricks.repl.allowedLanguages": "python,sql"
    }
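As a sanity check (a minimal sketch), the passthrough flag can be read back from a notebook; spark.conf.get accepts a default value, which avoids an exception if the key is unset:

# Confirm credential passthrough is actually enabled on this cluster
print(spark.conf.get("spark.databricks.passthrough.enabled", "false"))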

I then run the following code:

# Mount the container using AAD credential passthrough
dbutils.fs.mount(
  source = "wasbs://data@storage_account_name.blob.core.windows.net",
  mount_point = "/mnt/storage_account_name/data",
  extra_configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
  }
)

The mount succeeds, and I can access the volume with dbutils:

>> dbutils.fs.ls('dbfs:/mnt/storage_account_name/data')
[FileInfo(path='dbfs:/mnt/storage_account_name/data/folder/', name='folder/', size=0)]
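For completeness, a small sketch to make the mount call idempotent across re-runs (dbutils.fs.mounts() returns entries exposing a mountPoint attribute):

# Unmount first if the mount point already exists, so mount() does not fail
mount_point = "/mnt/storage_account_name/data"
if any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.unmount(mount_point)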

My issue appears when I run %sh ls /dbfs/mnt/storage_account_name/data or try to access the path with Python:

>> import os 
>> os.listdir('/dbfs/')
Out[1]: []

>> os.listdir('/dbfs/mnt/')
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/'

I can't figure out what I am missing. Is there something to configure to make the mount accessible to Python? Thanks.

Upvotes: 1

Views: 4361

Answers (2)

Vivek Atal

Reputation: 533

There are certain limitations when you use the credential passthrough option, which is why it did not work; there is no syntax issue. See the official documentation on credential passthrough to understand the restrictions.
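For example, Spark reads go through the passthrough-aware DBFS driver and still work where the local file API (/dbfs/...) does not. A sketch, assuming the folder contains CSV files:

# Works on a passthrough cluster: Spark handles the AAD token for you
df = spark.read.format("csv").option("header", "true") \
    .load("dbfs:/mnt/storage_account_name/data/folder/")
df.show(5)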

Upvotes: 3

Axel R.

Reputation: 1300

The answer is quite simple.

Local file API Limitations

The following list enumerates the limitations in local file API usage that apply to each Databricks Runtime version.

All - Does not support credential passthrough.

Source: https://learn.microsoft.com/en-us/azure/databricks/data/databricks-file-system#local-file-apis
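So the workaround is to stay on the DBFS APIs instead of the local file API, e.g. (a minimal sketch):

# dbutils.fs works with credential passthrough; os.listdir / %sh do not
files = dbutils.fs.ls("dbfs:/mnt/storage_account_name/data")
print([f.name for f in files])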

Upvotes: 0
