Reputation: 2882
When ingesting data and transforming the various layers of our data lake, which is built on an Azure Data Lake Storage (ADLS) Gen2 storage account with hierarchical namespace enabled, I can organize files in Containers or File Shares. We currently ingest raw files into a RAW container in their native ".csv" format. We then merge those files into a QUERY container in compressed Parquet format so that we can virtualize all the data using PolyBase in SQL Server.
It is my understanding that only files stored within File Shares can be accessed using the typical SMB/UNC paths. When building out a data lake such as this, should Containers within ADLS be avoided in order to gain the additional benefit of being able to access those same files via File Shares?
I did notice that files located under File Shares do not appear to support metadata key/value pairs (unless that is just not exposed through the UI). Other than that, I wonder whether there are any other real differences between the two types.
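To make the two access styles concrete, here is a minimal sketch of how the same relative path would be addressed in each case. The account, container, and share names are hypothetical placeholders, and the helper functions are my own illustration, not part of any Azure SDK. It assumes the standard `dfs.core.windows.net` and `file.core.windows.net` endpoints.

```python
def container_uri(account: str, container: str, path: str) -> str:
    """Build the abfss:// URI for a file in an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def file_share_unc(account: str, share: str, path: str) -> str:
    """Build the SMB/UNC path for a file in an Azure Files share."""
    # Normalize separators, then join host, share, and path parts with backslashes.
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    return "\\\\" + "\\".join([f"{account}.file.core.windows.net", share, *parts])


if __name__ == "__main__":
    # "mylake", "raw", and the file path are made-up example values.
    print(container_uri("mylake", "raw", "2024/sales.csv"))
    print(file_share_unc("mylake", "raw", "2024/sales.csv"))
```

Note that these address two separate storage services, so a file written to a container is not automatically visible through a File Share of the same name.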
Upvotes: 0
Views: 844
Reputation: 4544
Thanks to @Gaurav for sharing this in the comments.
(Posting it as an answer, using the details from the comments, to help other community members.)
Previously, only files stored in an Azure Storage File Share could be accessed through typical SMB/UNC paths. It is now also possible to mount a Blob container, using the NFS 3.0 protocol rather than SMB. This official Microsoft document provides step-by-step guidance.
Limitation: A Blob storage container can be mounted only from a Linux-based Azure Virtual Machine (VM) or from a Linux system running on-premises. Windows and macOS are not supported.
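As a rough illustration of what the mount looks like on a Linux client, the sketch below assembles the `mount` command for a container. The account name, container name, and mount point are hypothetical, and the helper function is only for illustration. The mount options shown (`sec=sys,vers=3,nolock,proto=tcp`) follow the example in Microsoft's guidance as I understand it, so check them against the current documentation. It also assumes NFS 3.0 is already enabled on the storage account and that the command is run with root privileges.

```python
import shlex
from typing import List


def nfs_mount_command(account: str, container: str, mount_point: str) -> List[str]:
    """Assemble the argument list for mounting a Blob container over NFS 3.0."""
    # The NFS export is <account>.blob.core.windows.net:/<account>/<container>.
    source = f"{account}.blob.core.windows.net:/{account}/{container}"
    # Options are an assumption based on Microsoft's documented example.
    options = "sec=sys,vers=3,nolock,proto=tcp"
    return ["mount", "-t", "nfs", "-o", options, source, mount_point]


if __name__ == "__main__":
    # "mylake", "query", and "/mnt/query" are made-up example values.
    # This only prints the command; it does not run it.
    print(shlex.join(nfs_mount_command("mylake", "query", "/mnt/query")))
```

The command is built as an argument list so it can be passed to `subprocess.run` without shell quoting issues, should you want to script the mount.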
Upvotes: 0