Reputation: 75
I have a secured storage account in Azure that I am trying to access using a Microsoft Fabric Dataflow Gen2. I managed to connect to the storage account using a VNet data gateway.
Getting the data results in a query that has the following columns:
| Content | Name | Extension | Date accessed | Date modified | Date created | Attributes | Folder path |
|---|---|---|---|---|---|---|---|
| [Binary] | Iris.parquet | .parquet | null | 5/29/2024, 8:04:02 AM | null | [Record] | https://mystorageaccount.dfs.core.windows.net/mycontainer |
| [Binary] | MT cars.parquet | .parquet | null | 5/29/2024, 8:04:02 AM | null | [Record] | https://mystorageaccount.dfs.core.windows.net/mycontainer |
| [Binary] | Titanic.parquet | .parquet | null | 5/29/2024, 8:04:02 AM | null | [Record] | https://mystorageaccount.dfs.core.windows.net/mycontainer |
Now imagine I have 100 parquet files. I do not want to navigate into each row and convert from parquet one file at a time; I want to load them all into a lakehouse as a bronze layer. How can I turn this query into all the different tables (three different tables in this example) and load them into a lakehouse?
Upvotes: 0
Views: 187
Reputation: 89361
Instead of using the VNet data gateway, configure Trusted Workspace Access for your storage account and create a shortcut to the storage account in the Files section of your Fabric Lakehouse.
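Once the shortcut is in place, the parquet files show up under the lakehouse Files section and a short Fabric notebook can promote them all to bronze Delta tables in one loop. A minimal PySpark sketch, assuming the default lakehouse is attached to the notebook and the shortcut is named `mycontainer` (both assumptions, adjust to your setup):

```python
# Minimal sketch for a Fabric notebook (PySpark).
# Assumes: the default lakehouse is attached, and the ADLS shortcut is mounted
# at Files/mycontainer (hypothetical name). `spark` and `mssparkutils` are
# provided by the notebook runtime.
import os
import re

shortcut_path = "Files/mycontainer"        # path of the shortcut inside the lakehouse

for f in mssparkutils.fs.ls(shortcut_path):
    if not f.name.endswith(".parquet"):
        continue
    # Derive a table name from the file name, e.g. "MT cars.parquet" -> "mt_cars"
    table_name = re.sub(r"[^0-9a-zA-Z]+", "_", os.path.splitext(f.name)[0]).strip("_").lower()
    df = spark.read.parquet(f.path)
    # Write each file as a bronze Delta table in the lakehouse Tables section
    df.write.mode("overwrite").format("delta").saveAsTable(table_name)
```

Because the loop is driven by whatever the shortcut lists, the same notebook handles 3 files or 100 without any per-file navigation.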
Alternatively, use a Fabric or ADF Pipeline to copy the files in binary mode into OneLake.
Or just use azcopy to read from ADLS and write to OneLake.
In short, Dataflow Gen2 is the wrong tool for this task.
Upvotes: 0