ben0

Reputation: 23

Reading from Cloud Storage, Zipping to BlobStore

I'm trying to read 5 basic text files from my cloud storage bucket, zip them, and write to BlobStore.

from google.appengine.api import files
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers
import StringIO
import zipfile

class FactoryHandler(blobstore_handlers.BlobstoreDownloadHandler):
   def get(self):
       """ SERVE THE BLOB, IF KEY AVAILABLE """
       k = self.request.get('key')
       if k:
           self.send_blob(k)
           return

       """ TAKES FROM CLOUD STORAGE , ZIPS IT """
       zipstream = StringIO.StringIO()
       zfile = zipfile.ZipFile(file=zipstream, mode='w')
       objects = files.listdir('/gs/test-bucket')

       for o in objects:
           with files.open(o, 'r') as f:
               data = f.read(1)

               while data != "":
                   zfile.writestr(o.encode('utf-8'),data)
                   data = f.read(1)

       zfile.close()
       zipstream.seek(0)

       """ NOW, ADD ZIP TO BLOBSTORE """
       zip_file = files.blobstore.create(mime_type='application/zip',_blobinfo_uploaded_filename='test.zip')
       zip_data = zipstream.getvalue()
       with files.open(zip_file, 'a') as f:
           f.write(zip_data)

       files.finalize(zip_file)
       blob_key = files.blobstore.get_blob_key(zip_file)
       self.response.out.write(blob_key)

Somehow I always end up with only the last character from each text file. I suspect it's because I'm calling f.read(1), but the code is basically iterating through each byte, then writing it to the zfile object.

I tried concatenating the data:

for o in objects:
    with files.open(o, 'r') as f:
        data = f.read(1)

        while data != "":
            data += f.read(1)

        """ once completed, write """
        zfile.writestr(o.encode('utf-8'),data)

but the App Engine dev server hangs. Possibly because we can't concatenate data.

Any solutions, and does this work for non-text files? (images, mp3s, etc)

EDIT:

I ran the answer's code on my production app with my Google Storage account and got this error:

ApplicationError: 8
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 710, in __call__
    handler.get(*groups)
  File "/base/data/home/apps/s~app-name/v1-55-app-proto.363439297871242967/factory.py", line 98, in get
    with files.open(o, 'r') as f:
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 520, in open
    exclusive_lock=exclusive_lock)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 276, in __init__
    self._open()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 423, in _open
    self._make_rpc_call_with_retry('Open', request, response)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 427, in _make_rpc_call_with_retry
    _make_call(method, request, response)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 252, in _make_call
    _raise_app_error(e)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/files/file.py", line 210, in _raise_app_error
    raise PermissionDeniedError(e)
PermissionDeniedError: ApplicationError: 8

My ACL settings were correct (I had previously hit an ACL access-denied error and fixed it).

These are the bucket-specific ACL settings I added to the original settings:

    <Entry>
        <Scope type="UserByEmail">
            <EmailAddress>
                [email protected]
            </EmailAddress>
        </Scope>
        <Permission>
            FULL_CONTROL
        </Permission>
    </Entry>

Any hints? According to the docs at https://developers.google.com/appengine/docs/python/googlestorage/exceptions:

exception PermissionDeniedError()
The application does not have permission to perform this operation.

UPDATE: I noticed that when I set the files to public_read, my app is able to read them. That suggests my app is somehow not properly configured to access them in private mode. Any hints? The only fix I know of is via the ACL, and I've already configured that part.

Upvotes: 2

Views: 1211

Answers (1)

Eric Olson

Reputation: 511

ben0,

The function ZipFile.writestr() writes an entire file to the zipfile. You need to read all of a file's data first, and then call writestr() once per file.
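To see why the first version keeps only the last character, here is a small sketch that can be run outside App Engine. It is not your handler: io.BytesIO stands in for both the zip stream and the Cloud Storage file, and zip_one_byte_at_a_time is a made-up helper name. It uses io.BytesIO rather than StringIO so that it also runs on current Python.

```python
import io
import warnings
import zipfile


def zip_one_byte_at_a_time(name, payload):
    """Reproduce the problem: one writestr() call per byte read."""
    stream = io.BytesIO()
    source = io.BytesIO(payload)  # stand-in for the file opened from the bucket
    with warnings.catch_warnings():
        # zipfile warns about duplicate names; silence that for the demo.
        warnings.simplefilter("ignore")
        with zipfile.ZipFile(stream, mode="w") as zf:
            data = source.read(1)
            while data != b"":
                zf.writestr(name, data)  # adds a NEW one-byte entry each time
                data = source.read(1)
    return stream.getvalue()


archive = zipfile.ZipFile(io.BytesIO(zip_one_byte_at_a_time("a.txt", b"hello")))
entry_count = len(archive.namelist())  # one entry per byte written
last_entry = archive.read("a.txt")     # lookup by name resolves to the last entry
```

With the five-byte payload, the archive ends up with five entries all named a.txt, each holding one byte, and reading by name returns only the final one. That matches the "only the last character" symptom.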

Your second code block is on the right track, but the while check needs to be updated to avoid an infinite loop. "data" will never be empty with this code, so a small change is needed to check the last chunk that was read. Something like this should work:

for o in objects:
    with files.open(o, 'r') as f:
        data_list = []

        chunk = f.read(1000)

        while chunk != "":
            data_list.append(chunk)
            chunk = f.read(1000)

        data = "".join(data_list)

        """ once completed, write """
        zfile.writestr(o.encode('utf-8'),data)

Also, reading larger chunks than 1 byte could be a bit faster, but since you're using small text files, it shouldn't matter much.
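On the non-text question: here is a rough, self-contained sketch of the same chunked loop that you can try locally. The names read_all and build_zip are hypothetical, and io.BytesIO objects stand in for the files opened from the bucket, so this only exercises the zipfile side. I'm assuming the Files API hands back the stored bytes unchanged; that part is not tested here.

```python
import io
import zipfile


def read_all(f, chunk_size=1000):
    """Read a file-like object to the end in fixed-size chunks."""
    chunks = []
    chunk = f.read(chunk_size)
    while chunk:  # an empty result means end of file
        chunks.append(chunk)
        chunk = f.read(chunk_size)
    return b"".join(chunks)


def build_zip(sources):
    """Zip a {name: file-like object} mapping, one writestr() call per file."""
    stream = io.BytesIO()
    with zipfile.ZipFile(stream, mode="w") as zf:
        for name, f in sorted(sources.items()):
            zf.writestr(name, read_all(f))
    return stream.getvalue()


# Stand-ins for bucket objects: one text file and one with arbitrary bytes.
text_payload = b"hello world" * 500
binary_payload = bytes(bytearray(range(256))) * 10  # every byte value 0..255

sources = {
    "notes.txt": io.BytesIO(text_payload),
    "blob.bin": io.BytesIO(binary_payload),
}
zip_bytes = build_zip(sources)
archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
```

Both entries read back byte for byte, so as far as zipfile is concerned images or mp3s are handled the same way as text. Note that each file is held fully in memory before it is written, which is fine for small files but may matter for large ones.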

Upvotes: 2
