Reputation: 61
I am attempting to give access to parquet files on a Gen2 Data Lake container. I have owner RBAC on the container but would prefer to limit access in the container for other users.
My Query is very simple:
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet',
FORMAT='PARQUET'
) AS [result]
When I run this I have no problems connecting. I have attempted to add ACL rights onto the files (and of course the containing folders 'Top' and 'Sub').
When my colleague attempts to run the SQL script the get the error message. Failed to execute query. Error: File 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet' cannot be opened because it does not exist or it is used by another process.
NB similar results are also experienced in Spark but with a 403 instead
SQL on-demand provides a link to the following help file after the error, it suggests:
If your query fails with the error saying 'File cannot be opened because it does not exist or it is used by another process' and you're sure both file exist and it's not used by another process it means SQL on-demand can't access the file. This problem usually happens because your Azure Active Directory identity doesn't have rights to access the file. By default, SQL on-demand is trying to access the file using your Azure Active Directory identity. To resolve this issue, you need to have proper rights to access the file. Easiest way is to grant yourself 'Storage Blob Data Contributor' role on the storage account you're trying to query.
I don't wish to grant Storage Blob Data Contributor or Storage Blob Data Reader as this gives access to every file on the container and not just those I want end users to be able to query. We have found the same experience occurs for SSMS connecting to parquet external tables.
So then in parts:
*I have resisted pasting my entire settings as I really have no idea what is relevant and what entirely irrelevant to this issue but of course can supply.
Upvotes: 2
Views: 4844
Reputation: 1
Go to azure storage explorer change ACL permissions in the route Folder and right click on your storage and click on "propogate access control lists"
Upvotes: 0
Reputation: 61
It would appear that the ACL feature was not working correctly in Preview for Azure Synapse Analytics.
I have now managed to get it to work. At present I see that once Read|Execute is provided to a folder it allows access to the files contained within that folder and sub folders. Access is available even when no specific ACL access is provided on a file in a sub folder. This is not quite what I expected however it provides enough for me to proceed: only giving access to the Gold folder allows for separation of access to the files I want to let users query and the working files that I want to keep hidden.
Upvotes: 2
Reputation: 14610
When you assign ACL to a folder it's not propagated recursively to all files inside the folder. Only new files inherit from the folder.
You can see this here
Upvotes: 2