Reputation: 19250
I'm using Python 3.8 and Azure Data Lake Gen 2. I want to set an expiration time for a file I save on the data lake. Following the documentation for the azure.datalake.store.core.AzureDLFileSystem class (Microsoft Docs), I tried the following:
file_client = directory_client.create_file(filename)
file_client.upload_data(
data,
overwrite=True
)
ts = time.time() + 100
file_client.set_expiry(path=path, expire_time=ts)
but I am getting the error:
AttributeError: 'DataLakeFileClient' object has no attribute 'set_expiry'
What's the proper way to set an expiration time when creating a file on the data lake?
Upvotes: 1
Views: 1752
Reputation: 21025
The reason for your error is that you are calling a method that belongs to azure.datalake.store.core.AzureDLFileSystem (the Gen1 SDK) on an object of type DataLakeFileClient. The method does not exist for objects of type DataLakeFileClient.
If you wish to call set_expiry, you must first create the correct kind of object.
For example, in Gen1, create the object first as described here:
https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-data-operations-python
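from azure.datalake.store import core, lib

## Declare variables
subscriptionId = 'FILL-IN-HERE'
adlsAccountName = 'FILL-IN-HERE'
## adlCreds is the token returned by lib.auth(...), as described on the linked page
## Create a filesystem client object
adlsFileSystemClient = core.AzureDLFileSystem(adlCreds, store_name=adlsAccountName)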
Using this object, you can call set_expiry on adlsFileSystemClient much as you did in your code example. Its signature is:
set_expiry(path, expiry_option, expire_time=None)
Note that expiry_option is a required argument, which your call omitted.
Just make sure you're trying to call methods on the correct type of object.
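As a rough illustration of the Gen1 call, here is a minimal sketch. It assumes fs_client is an AzureDLFileSystem created as above, that the "Absolute" option is what you want, and that expire_time is given in milliseconds since the Unix epoch (which is how the Gen1 docs describe it). The helper names absolute_expiry_ms and expire_in are made up for this example.

```python
import time


def absolute_expiry_ms(seconds_from_now, now=None):
    """Return a Unix timestamp in milliseconds, seconds_from_now in the future."""
    base = time.time() if now is None else now
    return int((base + seconds_from_now) * 1000)


def expire_in(fs_client, path, seconds_from_now):
    """Hypothetical helper: set an absolute expiry on a Gen1 file.

    fs_client is assumed to be an azure.datalake.store.core.AzureDLFileSystem.
    """
    expire_time = absolute_expiry_ms(seconds_from_now)
    # With "Absolute", expire_time is read as milliseconds since the Unix epoch.
    fs_client.set_expiry(path, "Absolute", expire_time=expire_time)
    return expire_time
```

Something like expire_in(adlsFileSystemClient, '/mydir/myfile.txt', 100) would then correspond to the time.time() + 100 in the question, converted from seconds to milliseconds.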
For Gen 2:
from azure.storage.filedatalake import DataLakeServiceClient
# connection_string is your storage account connection string
datalake_service_client = DataLakeServiceClient.from_connection_string(connection_string)
# Instantiate a FileSystemClient
file_system_client = datalake_service_client.get_file_system_client("mynewfilesystem")
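As an aside that goes beyond the documentation linked in this answer: recent releases of the azure-storage-file-datalake package document a set_file_expiry method on DataLakeFileClient. The sketch below assumes your installed version has it and that the account has a hierarchical namespace enabled, so check both before relying on it. The helper name upload_with_expiry is made up for this example.

```python
def upload_with_expiry(file_system_client, file_path, data, ttl_seconds):
    """Hypothetical helper: upload data, then ask the service to expire the file.

    file_system_client is assumed to be an
    azure.storage.filedatalake.FileSystemClient.
    """
    file_client = file_system_client.get_file_client(file_path)
    file_client.upload_data(data, overwrite=True)

    ttl_ms = int(ttl_seconds * 1000)
    # Assumption: with "RelativeToNow", expires_on is an offset in milliseconds.
    file_client.set_file_expiry("RelativeToNow", expires_on=ttl_ms)
    return ttl_ms
```

Passing the client in as an argument keeps the helper independent of how you authenticate, so a connection string or a credential object both work.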
For Gen2, you can set a blob to expire using a lifecycle management policy, as described here: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal#expire-data-based-on-age
Expire data based on age
Some data is expected to expire days or months after creation. You can configure a lifecycle management policy to expire data by deletion based on data age. The following example shows a policy that deletes all block blobs older than 365 days.
{
  "rules": [
    {
      "name": "expirationRule",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
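If you would rather generate the policy from Python than edit the JSON by hand, a small builder along these lines may help. This is only a sketch: build_expiry_policy is a made-up helper, and the optional prefixMatch filter is an assumption about how you might want to limit the rule to one container or folder. The policy works in whole days, so it cannot express something like the 100-second expiry in the question.

```python
import json


def build_expiry_policy(days, prefixes=None, rule_name="expirationRule"):
    """Build a lifecycle policy dict that deletes block blobs older than `days`."""
    filters = {"blobTypes": ["blockBlob"]}
    if prefixes:
        # Each prefix is expected to start with the container (file system) name.
        filters["prefixMatch"] = list(prefixes)
    return {
        "rules": [
            {
                "name": rule_name,
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": filters,
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                },
            }
        ]
    }


# Example: delete anything under mynewfilesystem/tmp/ after 30 days.
policy = build_expiry_policy(30, prefixes=["mynewfilesystem/tmp/"])
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the portal's lifecycle management code view or saved to a file for whatever deployment tooling you use.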
Upvotes: 2