LJRB
LJRB

Reputation: 129

How to read excel files in Synapse (managed private endpoint)?

With Azure Synapse I have a notebook where I read excel files located in Azure Data Lake.

Usually it worked, what changed is that now I use managed private endpoint.

When notebook is launched I have the error:


ServiceRequestError: Cannot connect to host adlsdev.blob.core.windows.net:443 ssl:True [Name or service not known]

I tried the following (worked before):

pd.read_excel('abfss://[email protected]/test/test1.xlsx')

What can I do in Synapse ?

Upvotes: 0

Views: 485

Answers (1)

Pratik Lad
Pratik Lad

Reputation: 8402

To read excel files in Synapse (managed private endpoint) you need to use linked service.

  • Create linked service on the storage account (ADLS for dfs.core.windows.net/BLOB for blob.core.windows.net endpoint According to your requirement) with (managed private endpoint) enabled integration runtime

enter image description here

  • Approve managed private endpoint on particular storage account as shown below

enter image description here

  • Then execute below code from your synapse notebook from this Microsoft doc for reading file from ADLS with linked service.
%%pyspark
# Set the required configs
source_full_storage_account_name = "Storage_account_name.dfs.core.windows.net"
spark.conf.set(f"spark.storage.synapse.{source_full_storage_account_name}.linkedServiceName", "Linked_ervice_name")
sc._jsc.hadoopConfiguration().set(f"fs.azure.account.auth.type.{source_full_storage_account_name}", "SAS_Token")
sc._jsc.hadoopConfiguration().set(f"fs.azure.sas.token.provider.type.{source_full_storage_account_name}", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedSASProvider")

import pandas as pd
# Python code
df = pd.read_excel('abfss://Container_name@Storage_account_name.dfs.core.windows.net/file.xlsx')
print(df)

OUTPUT:

enter image description here

Upvotes: 0

Related Questions