Reputation: 11
I would like a Synapse notebook to read ADLS blob data from outside the managed VNet, but I am getting 403 errors (for both managed identities and UPNs).
java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation.", 403, HEAD ...
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1200)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:519)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1713)
at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:47)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:377)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:332)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:315)
The ADLS storage account is configured with a firewall. A third-party vendor needs a vanilla ADLS storage account with no private endpoints to land HR data; their integration is an off-the-shelf product. We do not want to allow anonymous access.
Current configurations:
The ADLS logs show "Azure Synapse Analytics ... blocked" with Client IP Address: XXX.XXX.XXX.XXX.
The Synapse managed VNet is not accessible, so I cannot grab its IP address. I can query the parquet files in the Synapse workspace using the Linked Service.
How can I run a Synapse notebook and query the Workday ADLS storage area?
filename = 'fact_payroll_timecard/*.parquet'
data_path = 'abfss://%s@%s.dfs.core.windows.net/%s/%s' % (raw_container_name, raw_account_name, rawpath, filename)
dfpt = spark.read.parquet(data_path)
Briefly turning off IP address filtering successfully returned data.
Adding the CallerIpAddress found in the logs to the firewall's IP address allow list worked as well. To get the IP address of the calling notebook, turn on Diagnostic settings for the blob storage and run this query:
StorageBlobLogs
| where TimeGenerated > ago(3d)
| project TimeGenerated, OperationName, StatusText, CallerIpAddress
I don't think this is a long-term solution, as the caller IP address will change.
Upvotes: 0
Views: 287
Reputation: 3170
When a notebook is executed via a pipeline, the workspace managed service identity (MSI) is used for authentication.
Step 1: Ensure the workspace MSI has the necessary permissions on the storage account data. The simplest way to achieve this is to assign the workspace MSI the Storage Blob Data Contributor role on the storage account.
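For illustration, here is a minimal sketch of that role assignment with the Azure SDK for Python (azure-identity plus azure-mgmt-authorization); every ID below is a placeholder, and the flat parameter model assumes a recent SDK version:

import uuid
from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"  # placeholder
# Scope the assignment to the storage account itself
scope = (
    f"/subscriptions/{subscription_id}/resourceGroups/<rg>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
)
# Built-in role definition GUID for Storage Blob Data Contributor
role_definition_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization"
    "/roleDefinitions/ba92f5b4-2d11-453d-a403-e96b0029c9fe"
)

client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)
client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # each role assignment needs a fresh GUID name
    RoleAssignmentCreateParameters(
        role_definition_id=role_definition_id,
        principal_id="<workspace-msi-object-id>",  # placeholder: the Synapse workspace MSI
        principal_type="ServicePrincipal",
    ),
)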
Step 2: If the firewall is enabled on the storage account, follow these instructions: Configure Azure Storage firewalls and virtual networks
Here is an example where the firewall is enabled on the storage account. When you grant access to trusted Azure services in the storage account's networking settings (the exception "Allow Azure services on the trusted services list to access this storage account"), or add a resource instance rule for the Synapse workspace, the workspace can reach the storage account through the firewall.
Learn more: Connect to a secure storage account from your Azure Synapse workspace – Azure Synapse Analytics
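As an illustration only, the trusted-services exception and a resource instance rule for the Synapse workspace can also be set programmatically. This sketch assumes the azure-mgmt-storage SDK and uses placeholder IDs throughout; note that it replaces any existing network rules on the account:

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import (
    NetworkRuleSet,
    ResourceAccessRule,
    StorageAccountUpdateParameters,
)

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")
synapse_workspace_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<rg>"
    "/providers/Microsoft.Synapse/workspaces/<workspace>"
)
client.storage_accounts.update(
    "<rg>",
    "<storage-account>",
    StorageAccountUpdateParameters(
        network_rule_set=NetworkRuleSet(
            default_action="Deny",   # keep the firewall on
            bypass="AzureServices",  # trusted Azure services exception
            resource_access_rules=[
                ResourceAccessRule(  # resource instance rule for the workspace
                    tenant_id="<tenant-id>",
                    resource_id=synapse_workspace_id,
                )
            ],
        )
    ),
)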
Step 3: Configure the Linked Service
Open Synapse Studio and set up the Linked Service to authenticate with the workspace MSI (authentication type "System Assigned Managed Identity").
Step 4: Update the Notebook Code to Use the Linked Service Configuration
%%spark
// Replace with your linked service name
val linked_service_name = "LinkedServerName"

// Route storage authentication through the linked service's token provider
spark.conf.set("spark.storage.synapse.linkedServiceName", linked_service_name)
spark.conf.set("fs.azure.account.oauth.provider.type", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedTokenProvider")

// Replace the container and storage account names
val remote_path = "abfss://[email protected]/"
println("Remote blob path: " + remote_path)
mssparkutils.fs.ls(remote_path)
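If your notebook uses PySpark, as in the question, the same two Spark settings apply. Here is a minimal sketch, assuming the linked service is named "LinkedServerName" and reusing the path variables from the question:

%%pyspark
# Authenticate to storage through the linked service's token provider
spark.conf.set("spark.storage.synapse.linkedServiceName", "LinkedServerName")
spark.conf.set("fs.azure.account.oauth.provider.type", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedTokenProvider")

# Same read as in the question, now authorized via the workspace MSI
filename = 'fact_payroll_timecard/*.parquet'
data_path = 'abfss://%s@%s.dfs.core.windows.net/%s/%s' % (raw_container_name, raw_account_name, rawpath, filename)
dfpt = spark.read.parquet(data_path)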
Reference: Using the workspace MSI to authenticate a Synapse notebook when accessing an Azure Storage account
Upvotes: 0