Reputation: 33
I have a Cloud Function that is triggered when a zip archive is uploaded to Cloud Storage and is supposed to unpack it. However, the function runs out of memory, presumably because the unzipped contents are too large (~2.2 GB). I was wondering what my options are for dealing with this problem? I read that it's possible to stream large files into Cloud Storage, but I don't know how to do that from a Cloud Function or while unzipping. Any help would be appreciated.
Here is the code of the Cloud Function so far:
import io
from zipfile import ZipFile, is_zipfile

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket("bucket-name")
destination_blob_filename = "large_file.zip"
blob = bucket.blob(destination_blob_filename)

# Download the whole archive into memory, then re-upload each member.
zipbytes = io.BytesIO(blob.download_as_string())
if is_zipfile(zipbytes):
    with ZipFile(zipbytes, 'r') as myzip:
        for contentfilename in myzip.namelist():
            contentfile = myzip.read(contentfilename)
            blob = bucket.blob(contentfilename)
            blob.upload_from_string(contentfile)
Upvotes: 2
Views: 4192
Reputation: 75725
Your target process is risky: you download the archive and then re-upload each extracted file, so you end up with 2 successful operations without checksum validation!
Until Cloud Functions or Cloud Run offer more memory, you can use a Dataflow template (for example, the Bulk Decompress Cloud Storage Files template) to unzip your files.
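If you do want to try the streaming approach the question mentions, here is a minimal sketch. It assumes a recent google-cloud-storage client, whose Blob.open() returns chunked, file-like readers and writers, and it reuses the bucket and file names from the question:

import shutil
from zipfile import ZipFile

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket("bucket-name")
zip_blob = bucket.blob("large_file.zip")

# Blob.open("rb") gives a seekable, file-like reader that fetches byte
# ranges on demand instead of downloading the whole object up front.
with zip_blob.open("rb") as zip_stream, ZipFile(zip_stream, "r") as myzip:
    for contentfilename in myzip.namelist():
        # Copy each archive member to its own object in 1 MiB chunks,
        # so only a small buffer is held in memory at any time.
        with myzip.open(contentfilename) as member, \
                bucket.blob(contentfilename).open("wb") as target:
            shutil.copyfileobj(member, target, length=1024 * 1024)

Memory stays bounded to roughly the copy buffer, at the cost of many ranged reads against the zip object, which is slower than a single bulk download.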
Upvotes: 4