Tom Lous

Reputation: 2899

Unzipping uploaded file, using zip_read, in Google App Engine (GAE)

As far as I know, user-uploaded files in a PHP GAE environment have to go through CloudStorageTools::createUploadUrl(), which results in a file in a gs bucket (gs://[name]/[id]).

File upload works like a charm, but unzipping the uploaded file poses some problems. I've tried 3 approaches, but none of them seem to work for me:

  1. PHP zip functions (http://www.php.net/manual/en/ref.zip.php) are supported, but calling zip_open on a gs bucket path doesn't work (as a check, fopen on the same path did return a working file pointer: Resource id #120)

  2. ZipArchive (https://www.php.net/manual/en/book.zip.php). Unfortunately the ZipArchive library is not (yet?) supported on GAE; it would have to be compiled in.

  3. PclZip (http://www.phpconcept.net/pclzip) gives me a valid resource handle (zlib is supported), but I run into this issue (fseek is frequently 0): https://code.google.com/p/googleappengine/issues/detail?id=10881

Does anybody have an idea how to upload large zip files to GAE (PHP), unzip them and use them? I'm almost at the point where I'd ask users to extract the zip themselves, upload the extracted files separately and circumvent the entire unzip process.

Upvotes: 5

Views: 946

Answers (2)

Robert Moskal

Reputation: 22553

Wow, working with large zip files on GAE and gcloud storage appears to have caused a lot of people grief.

Long story short, no PHP method will unzip the file in the bucket. Witness... What you can do is upload the file to the /tmp folder of your GAE instance and then use a stream to extract its contents to the bucket.

$temp = '/tmp/source.zip';
$zip = new ZipArchive;
$res = $zip->open($temp);
if ($res === TRUE) {
    // Stream one entry out of the archive straight into the bucket
    $fp = $zip->getStream('somefile.csv');
    $target = fopen('gs://foo.com/target.csv', 'w');
    stream_copy_to_stream($fp, $target);
    fclose($fp);
    fclose($target);
    $zip->close();
    // Remove the temporary zip to free instance memory
    unlink($temp);
} else {
    $this->log->error('Could not unzip the hotels file');
}

This is not ideal, since storage in the /tmp folder is carved out of your instance memory. So you need to specify an instance type that is large enough to fit the source zip file in memory along with your application.

In my case my source was about 150MB with an extracted size of 2.8GB. I was able to handle my upload with the standard F2/B2 instance that has 256MB.
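
If the archive holds more than one file, the same idea can be extended to loop over every entry. The following is only a rough sketch under a few assumptions: ZipArchive is available in your runtime, the zip already sits on local disk, and the function name extractZipToPrefix and its parameters are made up for illustration. The target prefix can be a local directory or, on GAE, presumably a gs:// path like the one above.

```php
<?php
// Hypothetical helper (not part of any library): stream every file entry of a
// local zip into $targetPrefix, which may be a local directory or a stream
// wrapper path such as gs://bucket/folder (assumption: the wrapper is writable).
function extractZipToPrefix(string $zipPath, string $targetPrefix): array
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException("Could not open zip: $zipPath");
    }

    $isLocal = strpos($targetPrefix, '://') === false;
    $written = [];

    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        // Skip unreadable names, directory entries and suspicious paths
        if ($name === false || substr($name, -1) === '/' || strpos($name, '..') !== false) {
            continue;
        }

        $source = $zip->getStream($name);
        if ($source === false) {
            continue;
        }

        $targetPath = rtrim($targetPrefix, '/') . '/' . $name;
        // Local file systems need the parent directory to exist first
        if ($isLocal && !is_dir(dirname($targetPath))) {
            mkdir(dirname($targetPath), 0777, true);
        }

        $target = fopen($targetPath, 'w');
        if ($target === false) {
            fclose($source);
            $zip->close();
            throw new RuntimeException("Could not open target: $targetPath");
        }

        // Copy in chunks, so the extracted entry is never held in memory whole
        stream_copy_to_stream($source, $target);
        fclose($source);
        fclose($target);
        $written[] = $targetPath;
    }

    $zip->close();
    return $written;
}
```

Called as extractZipToPrefix('/tmp/source.zip', 'gs://foo.com/extracted'), it would return the list of paths it wrote. Since the entries are streamed, only the zip itself should count against the /tmp memory budget, not the extracted data.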

Upvotes: 1

MarkS

Reputation: 21

I know it's been a while, but I also tried the approaches above and they did not work. I was about to throw in the towel with GAE, and then found TbsZip.

With it I was able to upload an ODT (which is a zip file internally), extract a file, modify it, then stuff it back into a copy of the original zip and send it back to the user. I haven't tested it extensively, but it does seem to be a reasonable workaround until Google fixes GAE.

The API is similar to ZipArchive. I'm using it with the current GAE SDK (1.9.15).

Hope this helps!

Upvotes: 2
