Tim

Reputation: 1471

Azure Python SDK: fetching the list of blobs in ADLS Gen2

I am fetching the list of blobs from a classic Azure Blob Storage account using the Python SDK:

from azure.storage.blob import BlobServiceClient

...
def __init__(self, key, blob_accnt_name):
    self._blob_acct_url = f'https://{blob_accnt_name}.blob.core.windows.net'
    self.blob_service_client = BlobServiceClient(account_url=self._blob_acct_url, credential=key)
    self.container_client = self.blob_service_client.get_container_client(azure_blob_storage.get_container())
    
def fetchBlob(self):
    blob_items = self.container_client.list_blobs(name_starts_with='{}'.format(blob_path)) ############# (1)
    blob_paths =[]
    if any(True for _ in blob_items):
           blob_paths.append(....
...

For ADLS Gen2, I am trying to fetch the list using file_system_client.get_paths:

import os, uuid, sys
from azure.storage.filedatalake import DataLakeServiceClient
from azure.core import MatchConditions
from azure.storage.filedatalake import ContentSettings

# function to initialize the ADLS  account
def initStorage(storage_account_name, storage_account_key):
    try:  
        global service_client
        service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
            "https", storage_account_name), credential=storage_account_key)
    except Exception as e:
        print(e)

# Function to list the contents of the directory
def listDirectoryContents():
    try:
        file_system_client = service_client.get_file_system_client(file_system="<container-name>") # a file system in ADLS Gen2 corresponds to a container
        paths = file_system_client.get_paths(path="<directory-name>")  ################ (2)

        for path in paths:
            print(path.name + '\n')

    except Exception as e:
        print(e)

Are container_client.list_blobs and file_system_client.get_paths used the same way? (See the lines marked (1) and (2) in the code snippets above.)

Upvotes: 2

Views: 661

Answers (1)

Mohit Ganorkar

Reputation: 2069

  • Both list_blobs and get_paths return a lazy, paged iterator: results are fetched page by page, and the continuation tokens are followed for you as you iterate.

  • What the iterator yields is different. list_blobs yields the blobs inside the container (BlobProperties objects), while get_paths yields the paths in the file system (PathProperties objects), which can be files or directories.

  • The classes are different too: ContainerClient applies to a container and FileSystemClient to a file system. Conceptually they play the same role, and both clients can be created for an entity that does not exist yet.
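
To make the difference concrete, here is a minimal sketch of the two listing styles side by side. The helper names blob_names and file_paths are hypothetical (they are not part of the SDK), and the clients are assumed to be constructed already, as in the question. Only list_blobs(name_starts_with=...), get_paths(path=..., recursive=...), and the name and is_directory attributes come from the SDK itself.

```python
def blob_names(container_client, prefix):
    """Flat listing: names of all blobs whose name starts with `prefix`."""
    # list_blobs returns a lazy pager; pages are fetched as the loop advances.
    return [blob.name for blob in container_client.list_blobs(name_starts_with=prefix)]


def file_paths(file_system_client, directory, recursive=True):
    """Hierarchical listing: file paths under `directory`, without directory entries."""
    # get_paths also returns a lazy pager, but its items may be directories too,
    # so they are filtered out here to get a result comparable to a blob listing.
    paths = file_system_client.get_paths(path=directory, recursive=recursive)
    return [p.name for p in paths if not p.is_directory]
```

With the clients from the question this would be called as blob_names(self.container_client, blob_path) and file_paths(file_system_client, "<directory-name>"). Passing recursive=False should limit get_paths to the direct children of the directory, which has no direct equivalent in a prefix-based list_blobs call.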

container client class

file system client class

Upvotes: 2
