Kenny_I

Reputation: 2513

How to configure Azure Storage Gen 2 for Azure Databricks

I'm trying to mount a data lake in Databricks. My goal is to build a data lake. I wonder why the format of my URL is different from the one in the documentation. What do "filesystem" and "dfs" mean?

I created a data lake with Azure Storage Gen2, enabled the hierarchical namespace, and started to create directories. I noticed that the file URL includes the word "blob".

This is my url currently: https://datalakestagingtest.blob.core.windows.net/staging/manufacturers/nissan/micra.csv

I see that the format is different in the Data Lake documentation, where the URL looks like abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/

Reference: https://docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html

Upvotes: 2

Views: 4775

Answers (1)

CHEEKATLAPRADEEP

Reputation: 12788

A couple of important points to note when mounting storage accounts in Azure Databricks:

For Azure Blob storage: source = "wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<directory-name>"

For Azure Data Lake Storage gen2: source = "abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/"

To mount an Azure Data Lake Storage Gen2 file system, or a folder inside it, as part of the Azure Databricks file system, the source URL must have the form abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/
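As an illustration of how the two formats relate, here is a minimal sketch that rewrites a blob-endpoint https URL, such as the one in the question, into the abfss form. It assumes the account has the hierarchical namespace enabled, so the same data is reachable through both the blob and dfs endpoints, and that the first path segment is the container (which ADLS Gen2 calls the file system). The helper name `blob_url_to_abfss` is hypothetical and not part of any Azure or Databricks library.

```python
from urllib.parse import urlparse


def blob_url_to_abfss(url: str) -> str:
    """Rewrite an https storage URL as an abfss:// URI (hypothetical helper)."""
    parsed = urlparse(url)
    host = parsed.netloc
    # Host is expected to look like <storage-account-name>.blob.core.windows.net
    account, _, endpoint = host.partition(".")
    if not endpoint.startswith(("blob.", "dfs.")):
        raise ValueError(f"not a blob or dfs storage endpoint: {host}")
    # Assumption: the first path segment is the container / file system name.
    file_system, _, path = parsed.path.lstrip("/").partition("/")
    if not account or not file_system:
        raise ValueError(f"cannot find account or file system in: {url}")
    # Keep the domain suffix (e.g. core.windows.net) and swap in the dfs endpoint.
    suffix = endpoint.split(".", 1)[1]
    return f"abfss://{file_system}@{account}.dfs.{suffix}/{path}"


if __name__ == "__main__":
    print(blob_url_to_abfss(
        "https://datalakestagingtest.blob.core.windows.net"
        "/staging/manufacturers/nissan/micra.csv"
    ))
```

The resulting string is the kind of value that would go in the `source` field shown above; for a mount you would normally pass only the file system root or a folder rather than a single file.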


Reference: Azure Databricks - Azure Data Lake Storage Gen2

Upvotes: 3
