ruffen

Reputation: 1719

Trying to open parquet in Synapse - cannot be opened because it does not exist or it is used by another process

I am trying to open a Parquet file that is generated by Stream Analytics and stored in Azure Data Lake Storage Gen2. I have connected the data lake and Synapse successfully, but I keep getting an error saying that "https://datalake.dfs.core.windows.net/eh-orca-iot-pack-parquet/packdata/3134_A1_P1/2020/12/01/test.parquet" cannot be opened because it does not exist or it is used by another process.

I have the Global Administrator role in the Azure tenant that contains both the data lake and Synapse. I also renamed the file in question to test.parquet using Azure Storage Explorer, to be sure Stream Analytics was not holding onto it.

SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://stiotdata.dfs.core.windows.net/eh-orca-iot-pack-parquet/packdata/3134_A1_P1/2020/12/01/test.parquet',
        FORMAT='PARQUET'
    ) AS [result]

The documentation gives two possible causes: something else is holding the file (I checked and that is not the case, hence the rename), or my own account lacks access rights, but I have Global Administrator. Is there anything else to check?

Upvotes: 2

Views: 7111

Answers (1)

Arthur

Reputation: 11

It is as the documentation behind the error says. I tested it myself, though I had to read the documentation two or three times.

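You need to assign your own Azure Active Directory identity at least the Storage Blob Data Reader role on the storage account. Global Administrator is a directory role and does not by itself grant access to the data in the storage account.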

For me this solved the error!

https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/resources-self-help-sql-on-demand#query-fails-because-file-cannot-be-opened
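One way to make that role assignment is the Azure CLI's `az role assignment create`. Below is a minimal Python sketch that only builds and prints the command, so you can check it before running it in a shell where `az login` has been done. The account name `stiotdata` comes from the query in the question. The assignee, subscription ID and resource group are hypothetical placeholders, and the helper name `build_role_assignment_command` is mine.

```python
import shlex


def build_role_assignment_command(assignee, subscription_id, resource_group,
                                  storage_account,
                                  role="Storage Blob Data Reader"):
    """Return the az CLI argument list that grants `role` on one storage account."""
    # Scope the assignment to the storage account rather than the whole subscription.
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]


command = build_role_assignment_command(
    assignee="user@example.com",                             # hypothetical: your own sign-in name
    subscription_id="00000000-0000-0000-0000-000000000000",  # hypothetical placeholder
    resource_group="my-resource-group",                      # hypothetical placeholder
    storage_account="stiotdata",                             # account name from the question's query
)

# Print a shell-safe command line; nothing is executed here.
print(shlex.join(command))
```

After running the printed command, the new role can take a few minutes to become effective, so retry the OPENROWSET query after a short wait.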

Upvotes: 1
