Guillermo Colom

Reputation: 9

List all file names located in an Azure Blob Storage

I want to list in Databricks all file names located in an Azure Blob Storage.
My Azure Blob Storage is structured like this:

aaa
------bbb
------------bbb1.xml
------------bbb2.xml
------ccc
------------ccc1.xml
------------ccc2.xml
------------ccc3.xml

If I do:

dbutils.fs.ls('wasbs://<container>@<storage-account>.blob.core.windows.net/aaa')

only subfolders bbb and ccc are listed like this:

[FileInfo(path='wasbs://<container>@<storage-account>.blob.core.windows.net/aaa/bbb/', name='bbb/', size=0),
 FileInfo(path='wasbs://<container>@<storage-account>.blob.core.windows.net/aaa/ccc/', name='ccc/', size=0)]

I want to recurse down to the deepest subfolders and see all file names located under aaa: bbb1.xml, bbb2.xml, ccc1.xml, ccc2.xml and ccc3.xml.

If I do:

dbutils.fs.ls('wasbs://<container>@<storage-account>.blob.core.windows.net/aaa/*')

an error occurs because wildcards are not supported in the path.

Any idea how to do this in Databricks?

Upvotes: 1

Views: 2276

Answers (1)

Alex Ott

Reputation: 87369

dbutils.fs.ls doesn't support wildcards; that's why you're getting an error. You have a few choices:

  1. Use the Python SDK for Azure Blob Storage to list the files - it could be faster than recursive dbutils.fs.ls calls, but you will need to set up authentication, etc. (see the first sketch after this list).

  2. You can make recursive calls to dbutils.fs.ls using a function like the one below, but it's not very performant:

def list_files(path, max_level=1, cur_level=0):
  """
  Lists files under the given path, recursing up to max_level
  """
  d = dbutils.fs.ls(path)
  for i in d:
    # Directories are reported with a trailing "/" and a size of 0
    if i.name.endswith("/") and i.size == 0 and cur_level < (max_level - 1):
      # Descend one level deeper into the subfolder
      yield from list_files(i.path, max_level, cur_level + 1)
    else:
      yield i.path

# Example: with the default max_level=1 the function doesn't recurse at all,
# so for the two-level structure above you would call:
# list(list_files('wasbs://<container>@<storage-account>.blob.core.windows.net/aaa', max_level=2))
  3. You can use the Hadoop API to access files in your container, similar to this answer (see the second sketch below).
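A minimal sketch of the first option, using the azure-storage-blob package (pip install azure-storage-blob); the connection string and container name are placeholders, and authenticating with a connection string is just one possibility:

from azure.storage.blob import ContainerClient

# Assumption: authentication via a connection string; a SAS token or
# an Azure AD credential would work as well
container = ContainerClient.from_connection_string(
  conn_str="<your-connection-string>",
  container_name="<container>",
)

# list_blobs returns a flat listing of all blobs, so the nested "folders"
# are traversed automatically; name_starts_with restricts it to aaa/
for blob in container.list_blobs(name_starts_with="aaa/"):
  print(blob.name)  # e.g. aaa/bbb/bbb1.xml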
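And a sketch of the third option, reaching the Hadoop FileSystem API through Spark's JVM gateway (the wasbs URL below is a placeholder):

# Build a Hadoop Path for the folder and get the matching FileSystem
hadoop_path = spark._jvm.org.apache.hadoop.fs.Path(
  "wasbs://<container>@<storage-account>.blob.core.windows.net/aaa")
fs = hadoop_path.getFileSystem(spark._jsc.hadoopConfiguration())

# listFiles with recursive=True returns a RemoteIterator over all files
# below the path, skipping the intermediate directories
files = fs.listFiles(hadoop_path, True)
while files.hasNext():
  print(files.next().getPath().toString())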

Upvotes: 1
