Reputation: 87
I am reading data from the folder /mnt/lake/customer, where /mnt/lake is a mount path referring to ADLS Gen2. Now I would like to rename the folder from /mnt/lake/customer to /mnt/lake/customeraddress without copying the data from one folder to another.
I don't want to use the move (mv) command, as it takes a lot of time to copy the data; I have thousands of folders that need to be renamed, and the data volume is huge.
I want to do this through Databricks. Does anybody have an idea?
Upvotes: 1
Views: 2624
Reputation: 87069
Updated answer:
Unfortunately, right now dbutils.fs.mv is implemented as a copy plus a remove of the original file, so it can't be used. An alternative is the ADLS Python SDK, which has a rename_directory method to perform that task, something like this:
%pip install azure-storage-file-datalake azure-identity

from azure.storage.filedatalake import DataLakeServiceClient
from azure.identity import ClientSecretCredential

tenant_id = "...."
client_secret = dbutils.secrets.get("scope", "client_secret")
client_id = "...."

# Authenticate as a service principal
credential = ClientSecretCredential(tenant_id, client_id, client_secret)
service_client = DataLakeServiceClient(
    account_url="https://<storage_acc>.dfs.core.windows.net",
    credential=credential)
file_system_client = service_client.get_file_system_client(
    file_system="<container>")

# rename_directory is a metadata-only operation on ADLS Gen2, so no data is copied
directory_client = file_system_client.get_directory_client("<source_dir>")
new_dir_name = "abc2"
directory_client.rename_directory(
    new_name=directory_client.file_system_name + '/' + new_dir_name)
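Since the question mentions thousands of folders, the same client could drive a loop. A minimal sketch, assuming the folders to rename sit at the container root and a hypothetical rule of appending an "address" suffix (adjust both to your layout):

# Hedged sketch: list top-level directories and rename each one.
# The root location and the 'address' suffix are assumptions for illustration.
for path in file_system_client.get_paths(recursive=False):
    if path.is_directory:
        dir_client = file_system_client.get_directory_client(path.name)
        # new_name must be prefixed with the file system (container) name
        dir_client.rename_directory(
            new_name=dir_client.file_system_name + '/' + path.name + 'address')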
Original answer, before correction: A mount is just an entry in an internal database that maps the mount name to the actual location of the data. If you want to rename a mount point, just unmount it with dbutils.fs.unmount("/mnt/mount-name") and mount it again with dbutils.fs.mount under the new name (you need credentials for a service principal):
dbutils.fs.unmount("/mnt/lake/customer")

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "<application-id>",
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key-name>"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token"}

# Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
    mount_point = "/mnt/lake/customeraddress",
    extra_configs = configs)
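To double-check that the new mount point resolves to the same data, a quick listing (assuming the remount above succeeded):

# Verify the renamed mount point is readable
display(dbutils.fs.ls("/mnt/lake/customeraddress"))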
Upvotes: 1