Reputation: 597
I worked through the example code from the Azure docs https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python
from azure.storage.blob import BlockBlobService
account_name = "x"
account_key = "x"
top_level_container_name = "top_container"
blob_service = BlockBlobService(account_name, account_key)
print("\nList blobs in the container")
generator = blob_service.list_blobs(top_level_container_name)
for blob in generator:
    print("\t Blob name: " + blob.name)
Now I would like to know how to walk my container in a more fine-grained way. My container, top_level_container_name, has several subdirectories.
I would like to be able to list all of the blobs that are inside just one of those directories. For instance
How do I get a generator of just the contents of dir1 without having to walk all of the other dirs? (I would also take a list or dictionary)
I tried adding /dir1 to the container name, so that it would be top_level_container_name = "top_container/dir1", but that didn't work. I get back the error azure.common.AzureHttpError: The requested URI does not represent any resource on the server. ErrorCode: InvalidUri
The docs do not even seem to have any information on BlockBlobService.list_blobs() https://learn.microsoft.com/en-us/python/api/azure.storage.blob.blockblobservice.blockblobservice?view=azure-python
Update: list_blobs() comes from https://github.com/Azure/azure-storage-python/blob/ff51954d1b9d11cd7ecd19143c1c0652ef1239cb/azure-storage-blob/azure/storage/blob/baseblobservice.py#L1202
Upvotes: 27
Views: 71021
Reputation: 328
The parameter is name_starts_with. The code will look like this: container.list_blobs(name_starts_with=prefix_value)
Here prefix_value is the path of the directory inside the container, for example prefix_value = "dir1/".
Please check the documentation: https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.containerclient?view=azure-python#azure-storage-blob-containerclient-list-blobs
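As a minimal sketch of how that call might be wrapped, assuming container is a ContainerClient from azure-storage-blob v12 (the helper name list_dir is hypothetical, not part of the SDK). The prefix is a plain string match on blob names, so the trailing slash matters: "dir1" alone would also match "dir10/...".

```python
def list_dir(container, dir_name):
    """Return the names of all blobs under the virtual directory dir_name.

    `container` is assumed to be an azure.storage.blob.ContainerClient,
    or any object with a compatible list_blobs(name_starts_with=...) method.
    """
    # Blob "directories" are only name prefixes, so make sure the prefix
    # ends with "/" to avoid matching sibling names such as "dir10/...".
    prefix = dir_name.rstrip("/") + "/"
    return [blob.name for blob in container.list_blobs(name_starts_with=prefix)]
```

It would be called as list_dir(container, "dir1"), and the result is an ordinary list of blob names.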
Upvotes: 0
Reputation: 538
To get the blob files inside a directory or subdirectory, pass its file path as the prefix:
from azure.storage.blob import BlockBlobService
blob_service = BlockBlobService(account_name, account_key)
blobfile = []
generator = blob_service.list_blobs(container_name, prefix="filepath/", delimiter="")
for blob in generator:
    blobname = blob.name.split('/')[-1]
    blobfile.append(blobname)
    print("\t Blob name: " + blob.name)
print(blobfile)
Use delimiter="/" in the above code instead to get the blobs grouped as folders.
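To illustrate what the delimiter changes, here is a small pure-Python sketch that mimics the listing on a plain list of names instead of calling Azure. The function name split_listing and the sample blob names are made up for illustration. With an empty delimiter the listing is flat; with "/" every name deeper than one level is collapsed into a single folder entry.

```python
def split_listing(names, prefix, delimiter=""):
    """Mimic a prefix/delimiter listing over a plain list of blob names."""
    blobs, folders = [], []
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything below the first delimiter into one folder.
            folder = prefix + rest.split(delimiter, 1)[0] + delimiter
            if folder not in folders:
                folders.append(folder)
        else:
            blobs.append(name)
    return blobs, folders


names = ["filepath/a.csv", "filepath/sub/b.csv", "filepath/sub/c.csv", "other/d.csv"]
print(split_listing(names, "filepath/"))
# (['filepath/a.csv', 'filepath/sub/b.csv', 'filepath/sub/c.csv'], [])
print(split_listing(names, "filepath/", delimiter="/"))
# (['filepath/a.csv'], ['filepath/sub/'])
```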
Upvotes: 3
Reputation: 42886
The module azurebatchload provides for this and more. You can filter on folder or file names, and choose to get the result in various formats:
from azurebatchload import Utils
list_blobs = Utils(container='containername').list_blobs()

from azurebatchload import Utils
df_blobs = Utils(
    container='containername',
    dataframe=True
).list_blobs()

from azurebatchload import Utils
list_blobs = Utils(
    container='containername',
    name_starts_with="foldername/"
).list_blobs()

from azurebatchload import Utils
dict_blobs = Utils(
    container='containername',
    name_starts_with="foldername/",
    extended_info=True
).list_blobs()

from azurebatchload import Utils
df_blobs = Utils(
    container='containername',
    name_starts_with="foldername/",
    extended_info=True,
    dataframe=True
).list_blobs()
Disclaimer: I am the author of the azurebatchload module.
Upvotes: 8
Reputation: 481
I was not able to import BlockBlobService. It seems that BlobServiceClient is the new alternative. I followed the official docs and found this:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
# Create a Blob Storage Account client
connect_str = "<connectionstring>"
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
# Create a container client
container_name = "dummy"
container_client = blob_service_client.get_container_client(container_name)
# This will list all blobs in the container inside the dir1 folder/directory
blob_list = container_client.list_blobs(name_starts_with="dir1/")
for blob in blob_list:
    print("\t" + blob.name)
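The question also asks for a list or a dictionary. The iterable returned by list_blobs can be consumed into either; the following is only a sketch, assuming each item exposes name and size attributes as the blob properties objects in azure-storage-blob v12 do. The helper name blobs_to_dict is hypothetical, and stand-in objects are used so that it runs without a storage account.

```python
from types import SimpleNamespace


def blobs_to_dict(blob_list):
    """Map each blob name to its size in bytes."""
    return {blob.name: blob.size for blob in blob_list}


# Stand-in objects for illustration only. With a real container client you
# would pass container_client.list_blobs(name_starts_with="dir1/") instead.
fake_blobs = [
    SimpleNamespace(name="dir1/a.txt", size=10),
    SimpleNamespace(name="dir1/sub/b.txt", size=25),
]
print(blobs_to_dict(fake_blobs))
# {'dir1/a.txt': 10, 'dir1/sub/b.txt': 25}
```

For a plain list of names, [blob.name for blob in blob_list] does the same job.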
Upvotes: 35
Reputation: 136136
Please try something like:
generator = blob_service.list_blobs(top_level_container_name, prefix="dir1/")
This should list blobs and folders in the dir1 virtual directory.
If you want to list all blobs inside the dir1 virtual directory, please try something like:
generator = blob_service.list_blobs(top_level_container_name, prefix="dir1/", delimiter="")
For more information, please see this link.
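This also explains why "top_container/dir1" fails as a container name: the container is a single path segment, and everything after it belongs to the blob name. A minimal sketch of splitting such a path into the two arguments that list_blobs expects (the helper name split_container_path is hypothetical):

```python
def split_container_path(path):
    """Split 'container/dir/sub' into (container_name, prefix)."""
    container, _, directory = path.strip("/").partition("/")
    # No directory part means no prefix, i.e. list the whole container.
    prefix = directory + "/" if directory else None
    return container, prefix


print(split_container_path("top_container/dir1"))
# ('top_container', 'dir1/')
```

The two values can then be passed on, for example as blob_service.list_blobs(container, prefix=prefix).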
Upvotes: 41