Reputation: 129
With Azure Synapse I have a notebook where I read excel files located in Azure Data Lake.
Usually it worked, what changed is that now I use managed private endpoint.
When notebook is launched I have the error:
ServiceRequestError: Cannot connect to host adlsdev.blob.core.windows.net:443 ssl:True [Name or service not known]
I tried the following (worked before):
pd.read_excel('abfss://[email protected]/test/test1.xlsx')
What can I do in Synapse ?
Upvotes: 0
Views: 485
Reputation: 8402
To read excel files in Synapse (managed private endpoint) you need to use linked service.
%%pyspark
# Set the required configs
source_full_storage_account_name = "Storage_account_name.dfs.core.windows.net"
spark.conf.set(f"spark.storage.synapse.{source_full_storage_account_name}.linkedServiceName", "Linked_ervice_name")
sc._jsc.hadoopConfiguration().set(f"fs.azure.account.auth.type.{source_full_storage_account_name}", "SAS_Token")
sc._jsc.hadoopConfiguration().set(f"fs.azure.sas.token.provider.type.{source_full_storage_account_name}", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedSASProvider")
import pandas as pd
# Python code
df = pd.read_excel('abfss://Container_name@Storage_account_name.dfs.core.windows.net/file.xlsx')
print(df)
OUTPUT:
Upvotes: 0