Naaruto Uchiha

Reputation: 83

Azure Synapse connection to private container in Blob store is not working

I am following the steps from this guide to connect to my Blob store which is General purpose V2 (Not ADL Gen 2): https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/microsoft-spark-utilities?pivots=programming-language-python

My linked service to the Blob storage account uses managed identity, and it works fine for reading files from a container with public access, using this syntax:

blob_sas_token = mssparkutils.credentials.getConnectionStringOrCreds(linked_service_name)
wasb_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)

spark.conf.set('fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name), blob_sas_token)
df = spark.read.option('multiline','true').json(wasb_path)

But when I try to connect to a private-access container using the same method, I get a "Path does not exist" error.

I cannot change the access level of the container because it is a production system.

I also tried creating a linked service with an account-level SAS key and used the same syntax just to see if it works, but I get:

Py4JJavaError: An error occurred while calling z:mssparkutils.credentials.getConnectionStringOrCreds.
: java.lang.Exception: Access token couldn't be obtained {"result":"DependencyError","errorId":"BadRequest","errorMessage":"LSRServiceException is [{\"StatusCode\":400,\"ErrorResponse\":{\"code\":\"SasTokenNull\",\"message\":\"SAS token is null

Upvotes: 1

Views: 2990

Answers (1)

Rakesh Govindula

Reputation: 11454

Please recheck the path of your blob file and the roles assigned to the managed identity. If you still face the same error, you can use a linked service for Blob storage with account key authentication as a workaround.
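To make the path check easier, here is a minimal sketch that builds the `wasbs://` URL in one place so you can print it and compare it with the blob's location in the portal. The helper name `build_wasbs_path` is hypothetical (it is not part of `mssparkutils`), and stripping a leading slash is my assumption about one common way the relative path can go wrong.

```python
def build_wasbs_path(container, account, relative_path):
    """Return the wasbs:// URL for a blob path.

    Hypothetical helper: a leading '/' in relative_path is stripped so the
    URL does not end up with a double slash after the host name.
    """
    return 'wasbs://%s@%s.blob.core.windows.net/%s' % (
        container, account, relative_path.lstrip('/'))


# Placeholder names for illustration only; substitute your own values.
print(build_wasbs_path('mycontainer', 'myaccount', '/folder/file.json'))
```

If the printed URL does not match the container and folder you see in the storage account, the "Path does not exist" error is likely a path problem rather than a permissions problem.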

These are my private containers.

CSV file inside the container3


When creating the linked service for the Blob storage, select Account key as the Authentication type and provide your subscription and storage account details.


Code:

# `spark` is predefined in a Synapse notebook session.
from notebookutils import mssparkutils

blob_account_name = 'storage account name'
blob_container_name = 'container name'
blob_relative_path = 'blob folder path'
linked_service_name = 'linked service name'

# Fetch the credential from the linked service.
blob_sas_token = mssparkutils.credentials.getConnectionStringOrCreds(linked_service_name)

# Build the wasbs path and register the credential for this container.
wasb_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set('fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name), blob_sas_token)

df = spark.read.csv(wasb_path)
df.show()

This is the same code as in the official documentation that you linked.
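If you want the notebook to fail early when the linked service returns nothing usable, a small wrapper like the one below could help. This is only a sketch: `configure_blob_sas` is a hypothetical name, and the empty-value check is my assumption about a useful guard, not something the documentation prescribes. It only relies on `spark.conf.set`, which the code above already uses.

```python
def configure_blob_sas(spark, container, account, token):
    """Register a credential for one container on the Spark session.

    Hypothetical helper: raises ValueError if the credential is empty,
    otherwise sets the per-container conf key and returns that key.
    """
    if not token:
        raise ValueError('Linked service returned an empty credential')
    key = 'fs.azure.sas.%s.%s.blob.core.windows.net' % (container, account)
    spark.conf.set(key, token)
    return key
```

In a notebook you would call it as `configure_blob_sas(spark, blob_container_name, blob_account_name, blob_sas_token)` before reading from `wasb_path`.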


Upvotes: 1
