Reputation: 467
I am trying to unzip .gz files stored in Azure Data Lake.
from azure.datalake.store import core, lib
Tenant_Id = '####'
Client_Key = '####'
Client_Id = '####'
token = lib.auth(tenant_id=Tenant_Id, client_secret=Client_Key, client_id=Client_Id)
store_name = 'root'
# Connecting to adl
adl = core.AzureDLFileSystem(token, store_name=store_name)
# List of .gz files
list_of_gz_files = adl.ls('/test/2018')
# Would like to unzip the files in list_of_gz_files
Is it possible to unzip them with the gzip module or something similar?
Upvotes: 1
Views: 1651
Reputation: 23767
Here are three options for decompressing .gz files in ADL:
1. Use Azure Data Factory to decompress the files with the Copy activity (it has native support for gzip files).
2. Use a Custom Activity in ADF. Create a job in Azure Batch that accesses the data lake and decompresses the files with Python code (using the gzip package).
3. Use a custom extractor in U-SQL. Please refer to this thread: How to preprocess and decompress .gz file on Azure Data Lake store?
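For option 2, or for any Python process that can reach the store, a minimal sketch of the decompression step might look like the following. It assumes `fs` is the `adl` object from the question and that `fs.open(path, mode)` returns a binary file-like object. `gunzip_file` and `gunzip_folder` are hypothetical helper names, and writing the output next to the input with the `.gz` suffix stripped is an assumption, not a requirement.

```python
import gzip
import shutil


def gunzip_file(fs, src_path, dst_path, chunk_size=4 * 1024 * 1024):
    """Stream-decompress src_path to dst_path through fs.open().

    Assumption: fs behaves like the `adl` object in the question, i.e.
    fs.open(path, mode) returns a binary file-like object.
    """
    with fs.open(src_path, 'rb') as raw, \
            gzip.GzipFile(fileobj=raw, mode='rb') as gz, \
            fs.open(dst_path, 'wb') as out:
        # Copy in chunks so a large file is never held fully in memory.
        shutil.copyfileobj(gz, out, chunk_size)


def gunzip_folder(fs, folder):
    """Decompress every .gz file returned by fs.ls(folder).

    Returns the list of paths that were written.
    """
    written = []
    for path in fs.ls(folder):
        if path.endswith('.gz'):
            target = path[:-len('.gz')]  # hypothetical naming: drop the suffix
            gunzip_file(fs, path, target)
            written.append(target)
    return written
```

With the connection from the question, this would be called as `gunzip_folder(adl, '/test/2018')`. The data is pulled down to wherever the Python process runs and written back to the store, so for large volumes options 1 or 3 may be the better fit.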
Upvotes: 1