Reputation: 1783
I'm simply trying to append to a CSV file in a container on Azure Data Lake, and I have the following class to do it:
from azure.storage.filedatalake import DataLakeServiceClient
from datetime import datetime


class AzureHandler:
    ###
    # CONSTRUCTOR AzureHandler
    ###
    def __init__(self, storage_account, storage_key):
        connect_string = "DefaultEndpointsProtocol=https;AccountName=" + \
            storage_account + ";AccountKey=" + storage_key + \
            ";EndpointSuffix=core.windows.net"
        self.datalake_service_client = DataLakeServiceClient.from_connection_string(
            conn_str=connect_string)

    def write_tag_csv_file(self, container, folder_name, file_name, data):
        filename_to_write = datetime.today().strftime('%Y%m%d')+'_'+file_name
        file_system_client = self.datalake_service_client.get_file_system_client(
            container)
        directory_client = file_system_client.get_directory_client(folder_name)
        try:
            file_client = directory_client.get_file_client(filename_to_write)
            file_client.get_file_properties().size
            filesize_previous = file_client.get_file_properties().size
            file_client.append_data(
                data, offset=filesize_previous, length=len(data))
            file_client.flush_data(filesize_previous+len(data))
        except:
            file_client = directory_client.create_file(file_name)
            filesize_previous = 0
            file_client.append_data(
                data, offset=filesize_previous, length=len(data))
            file_client.flush_data(filesize_previous+len(data))
However, whenever I call AzureHandler.write_tag_csv_file I get a few of the following errors:
Traceback (most recent call last):
  File "C:\python39\lib\site-packages\azure\storage\filedatalake\_data_lake_file_client.py", line 450, in append_data
    return self._client.path.append_data(**options)
  File "C:\python39\lib\site-packages\azure\storage\filedatalake\_generated\operations\_path_operations.py", line 1617, in append_data
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: (InvalidHeaderValue) The value for one of the HTTP headers is not in the correct format.
I tried to read up a bit and saw it might be an API version issue, so I tried all available API versions by specifying them where I declare my datalake_service_client, but all of them give me the same error. How can I fix this?
Upvotes: 0
Views: 1244
Reputation: 30005
Your code looks correct: I can run it without any errors, and the append operation works for a .csv file.
Please try installing the latest version of the ADLS Gen2 package, azure-storage-file-datalake 12.2.3. You can use this command to install it: pip install azure-storage-file-datalake==12.2.3. There is no need to specify the api_version where you declare your datalake_service_client.
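To confirm which version is actually installed in the environment that runs the script, something like the following could be used. It is only a sketch: the helper name installed_version is made up for illustration, and it relies on importlib.metadata from the standard library (Python 3.8 or later).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    installed_version is a hypothetical helper name, not part of any Azure SDK.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version, or None if the package is not installed
    # in the interpreter that runs this script.
    print(installed_version("azure-storage-file-datalake"))
```

Running it with the same interpreter as the failing script (C:\python39 in the traceback) helps rule out a mismatch between the environment where pip installed the package and the one that executes the code.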
Please let me know if you still have the issue, and please also share the detailed code showing how you call this method.
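For comparison, here is a minimal sketch of the append step, assuming the data is sent as bytes. The function name append_bytes is hypothetical; file_client stands for an object that behaves like DataLakeFileClient (it only needs get_file_properties, append_data and flush_data, which the question's code already uses). Encoding first and skipping empty payloads are precautions of this sketch, not a confirmed cause of the InvalidHeaderValue error.

```python
def append_bytes(file_client, payload):
    """Append payload to an existing file and commit it; return the new size.

    append_bytes is a hypothetical helper. file_client is assumed to behave
    like azure.storage.filedatalake.DataLakeFileClient.
    """
    # Encode first so that len() counts bytes rather than characters;
    # the two differ as soon as the text contains non-ASCII characters.
    if isinstance(payload, str):
        payload = payload.encode("utf-8")

    # The current file size is the offset at which new data is appended.
    offset = file_client.get_file_properties().size

    # Assumption of this sketch: do not send a zero-length append at all.
    if not payload:
        return offset

    file_client.append_data(payload, offset=offset, length=len(payload))
    # flush_data takes the total length of the file after the append.
    file_client.flush_data(offset + len(payload))
    return offset + len(payload)
```

With the class from the question, this would be called on the client returned by directory_client.get_file_client(filename_to_write), for example append_bytes(file_client, "tag,value\n"). Knowing what type and content data has in your actual call would help narrow the problem down.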
Upvotes: 1