abhishek
abhishek

Reputation: 1

Azure Synapse Error - File cannot be opened because it does not exist or it is used by another process. Using ACLs to authorize the access

enter image description hereNeed your help to find where I am exactly wrong.

Problem Statement - Run the following SQL in Azure Synapse: `

select top 10 * from 
OPENROWSET(
    BULK 'https://myxyzaccount.blob.core.windows.net/data/covid_12_31_2021.csv',
    format = 'csv',
    PARSER_VERSION = '2.0',
    FIRSTROW = 2) as rows

`

Error Encountered - File 'https://myxyzaccount.blob.core.windows.net/data/covid_12_31_2021.csv' cannot be opened because it does not exist or it is used by another process.

Please note -

  1. The blob (csv file) actually exists in the path. I don't have permission to assign roles using IAM in the storage account. Hence I cannot assign 'Blob Contributor or Reader' role (although I was able to create the Storage account all by myself).

  2. I checked the documentation https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/resources-self-help-sql-on-demand?tabs=x80070002#query-fails-because-file-cannot-be-opened. Please see the section Alternative to Storage Blob Data Contributor role

My understanding is, even if one is unable to grant authorization via storage account IAM, one can still let synapse access the required file using proper ACL assigment.

  1. My blob path is myxyzaccount/data/covid_12_31_2021.csv I added the MSI of Azure Synapse in the Container Root directory (data) and gave 'Execute' permission. Then I provided Read & Execute (as well as write) permission at the file (covid_12_31_2021.csv) level.

Yet, I am getting the same error, I pasted above. For your reference I am attaching screenshots of the ACL assignments at the container (root) and file level.

Can you please help what exactly I am doing wrong?

Thanks!!

Upvotes: 0

Views: 4822

Answers (1)

Utkarsh Pal
Utkarsh Pal

Reputation: 4544

In Synapse if the query fails with the error - File cannot be opened because it does not exist or it is used by another process but the file actually exists, it means serverless SQL pool can't access the file. This problem usually happens because the Azure Active Directory identity doesn't have rights to access the file. By default, serverless SQL pool is trying to access the file using your Azure Active Directory identity. To resolve this issue, you need to have proper rights to access the file. First grant yourself 'Storage Blob Data Contributor' role on the storage account you're trying to query.

Refer Microsoft’s official document: Assign Azure roles using the Azure portal

I tried to repro the same and it is working fine for me.

  1. Make sure your Blob/ADLS storage account is linked under Data Tab. If it is not Linked, click on + symbol in above shown image to link.

enter image description here

  1. On the file name in container, click on New SQL Query and select Select TOP 100 Rows.

enter image description here

  1. You will get auto generated code as shown below.
-- This is auto-generated code
SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://<storage_name>.dfs.core.windows.net/sink/business (1).csv',
        FORMAT = 'CSV',
            PARSER_VERSION = '2.0'
    ) AS [result]

This will work fine now and you will get the expected result. Refer below image.

enter image description here

Upvotes: -1

Related Questions