PJT

Reputation: 185

Databricks - Download a dbfs:/FileStore file to my Local Machine

Normally I use the URL below to download a file from the Databricks DBFS FileStore to my local computer.

https://<MY_DATABRICKS_INSTANCE_NAME>/fileStore/?o=<NUMBER_FROM_ORIGINAL_URL>

However, this time the file is not downloaded and the URL leads me to the Databricks homepage instead. Does anyone have a suggestion on how I can download a file from DBFS to my local machine, or how I should fix the URL to make it work?

Any suggestions would be greatly appreciated!

PJ

Upvotes: 12

Views: 34087

Answers (1)

CHEEKATLAPRADEEP

Reputation: 12788

Method 1: Using the Databricks portal GUI, you can download the full results (max 1 million rows).


Method 2: Using the Databricks CLI

To download the full results, first save the file to DBFS and then copy the file to your local machine using the Databricks CLI, for example:

dbfs cp "dbfs:/FileStore/tables/my_my.csv" "A:\AzureAnalytics"

You can access DBFS objects using the DBFS CLI, DBFS API, Databricks file system utilities (dbutils.fs), Spark APIs, and local file APIs.

In a Spark cluster you access DBFS objects using Databricks file system utilities, Spark APIs, or local file APIs.

On a local computer you access DBFS objects using the Databricks CLI or DBFS API.
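
For example, if you prefer calling the DBFS API directly from a local machine instead of installing the CLI, a rough sketch might look like this. The workspace URL and personal access token are placeholders you would need to fill in; the read endpoint returns base64-encoded chunks of at most 1 MB per call.

# Rough sketch: download a DBFS file via the DBFS REST API from a local machine.
# <databricks-instance> and <token> are placeholders for your workspace URL and personal access token.
import base64, requests

host = "https://<databricks-instance>"
headers = {"Authorization": "Bearer <token>"}
path = "/FileStore/tables/my_my.csv"  # the REST API uses absolute paths without the dbfs: prefix

with open("my_my.csv", "wb") as out:
    offset = 0
    while True:
        resp = requests.get(
            f"{host}/api/2.0/dbfs/read",
            headers=headers,
            params={"path": path, "offset": offset, "length": 1024 * 1024},
        ).json()
        if resp.get("bytes_read", 0) == 0:
            break
        out.write(base64.b64decode(resp["data"]))
        offset += resp["bytes_read"]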

Reference: Azure Databricks – Access DBFS

The DBFS command-line interface (CLI) uses the DBFS API to expose an easy-to-use command-line interface to DBFS. Using this client, you can interact with DBFS using commands similar to those you use on a Unix command line. For example:

# List files in DBFS
dbfs ls
# Put local file ./apple.txt to dbfs:/apple.txt
dbfs cp ./apple.txt dbfs:/apple.txt
# Get dbfs:/apple.txt and save to local file ./apple.txt
dbfs cp dbfs:/apple.txt ./apple.txt
# Recursively put local dir ./banana to dbfs:/banana
dbfs cp -r ./banana dbfs:/banana

Reference: Installing and configuring Azure Databricks CLI

Method 3: Using a third-party tool named DBFS Explorer

DBFS Explorer was created as a quick way to upload and download files to the Databricks filesystem (DBFS). It works with both AWS and Azure instances of Databricks. You will need to create a bearer token in the web interface in order to connect.


Upvotes: 18
