DankMemes

Reputation: 2137

Google Drive PHP API - How to stream a large file

I know how to download files all at once, as shown in the example on this page: Google Drive API - Files: get.

However, if I have a very large file, loading it all into memory would be inefficient and resource-hogging. Therefore, I would like to know if it's possible to stream files with the Drive API, only loading bits of the file into memory at a time, and processing them (in some way like writing to a file or writing directly to output). I've read the docs and even looked a bit at the source for the PHP Google Drive SDK, and it seems that there is support for streaming, but I can't figure out how to use it. All help is appreciated.

Upvotes: 6

Views: 7153

Answers (4)

evtuhovdo

Reputation: 334

There is a good example in the library's Git repository: https://github.com/google/google-api-php-client/blob/master/examples/large-file-download.php

<?php
      /*
       * Copyright 2011 Google Inc.
       *
       * Licensed under the Apache License, Version 2.0 (the "License");
       * you may not use this file except in compliance with the License.
       * You may obtain a copy of the License at
       *
       *     http://www.apache.org/licenses/LICENSE-2.0
       *
       * Unless required by applicable law or agreed to in writing, software
       * distributed under the License is distributed on an "AS IS" BASIS,
       * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
       * See the License for the specific language governing permissions and
       * limitations under the License.
       */

include_once __DIR__ . '/../vendor/autoload.php';
include_once "templates/base.php";

echo pageHeader("File Download - Downloading a large file");

/*************************************************
 * Ensure you've downloaded your oauth credentials
 ************************************************/
if (!$oauth_credentials = getOAuthCredentialsFile()) {
    echo missingOAuth2CredentialsWarning();
    return;
}

/************************************************
 * The redirect URI is to the current page, e.g:
 * http://localhost:8080/large-file-download.php
 ************************************************/
$redirect_uri = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['PHP_SELF'];

$client = new Google_Client();
$client->setAuthConfig($oauth_credentials);
$client->setRedirectUri($redirect_uri);
$client->addScope("https://www.googleapis.com/auth/drive");
$service = new Google_Service_Drive($client);

/************************************************
 * If we have a code back from the OAuth 2.0 flow,
 * we need to exchange that with the
 * Google_Client::fetchAccessTokenWithAuthCode()
 * function. We store the resultant access token
 * bundle in the session, and redirect to ourself.
 ************************************************/
if (isset($_GET['code'])) {
    $token = $client->fetchAccessTokenWithAuthCode($_GET['code']);
    $client->setAccessToken($token);

    // store in the session also
    $_SESSION['upload_token'] = $token;

    // redirect back to the example
    header('Location: ' . filter_var($redirect_uri, FILTER_SANITIZE_URL));
}

// set the access token as part of the client
if (!empty($_SESSION['upload_token'])) {
    $client->setAccessToken($_SESSION['upload_token']);
    if ($client->isAccessTokenExpired()) {
        unset($_SESSION['upload_token']);
    }
} else {
    $authUrl = $client->createAuthUrl();
}

/************************************************
 * If we're signed in then lets try to download our
 * file.
 ************************************************/
if ($client->getAccessToken()) {
    // Check for "Big File" and include the file ID and size
    $files = $service->files->listFiles([
        'q' => "name='Big File'",
        'fields' => 'files(id,size)'
    ]);

    if (count($files) == 0) {
        echo "
        <h3 class='warn'>
        Before you can use this sample, you need to
        <a href='/large-file-upload.php'>upload a large file to Drive</a>.
        </h3>";
        return;
    }

    // If this is a POST, download the file
    if ($_SERVER['REQUEST_METHOD'] == 'POST') {
        // Determine the file's size and ID
        $fileId = $files[0]->id;
        $fileSize = intval($files[0]->size);

        // Get the authorized Guzzle HTTP client
        $http = $client->authorize();

        // Open a file for writing
        $fp = fopen('Big File (downloaded)', 'w');

        // Download in 1 MB chunks
        $chunkSizeBytes = 1 * 1024 * 1024;
        $chunkStart = 0;

        // Iterate over each chunk and write it to our file
        while ($chunkStart < $fileSize) {
            $chunkEnd = $chunkStart + $chunkSizeBytes;
            $response = $http->request(
                'GET',
                sprintf('/drive/v3/files/%s', $fileId),
                [
                    'query' => ['alt' => 'media'],
                    'headers' => [
                        'Range' => sprintf('bytes=%s-%s', $chunkStart, $chunkEnd)
                    ]
                ]
            );
            $chunkStart = $chunkEnd + 1;
            fwrite($fp, $response->getBody()->getContents());
        }
        // close the file pointer
        fclose($fp);

        // redirect back to this example
        header('Location: ' . filter_var($redirect_uri . '?downloaded', FILTER_SANITIZE_URL));
    }
}
?>
?>

<div class="box">
<?php if (isset($authUrl)): ?>
  <div class="request">
    <a class='login' href='<?= $authUrl ?>'>Connect Me!</a>
  </div>
<?php elseif(isset($_GET['downloaded'])): ?>
  <div class="shortened">
    <p>Your call was successful! Check your filesystem for the file:</p>
    <p><code><?= __DIR__ . DIRECTORY_SEPARATOR ?>Big File (downloaded)</code></p>
  </div>
<?php else: ?>
  <form method="POST">
    <input type="submit" value="Click here to download a large (20MB) test file" />
  </form>
<?php endif ?>
</div>

<?= pageFooter(__FILE__) ?>

Upvotes: 1

Fabio Vitale

Reputation: 1

I created MediaFileDownload.php, adapted from the MediaFileUpload.php that ships with the Google API client for PHP:

<?php
/**
 * Copyright 2012 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

use GuzzleHttp\Psr7;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Uri;
use Psr\Http\Message\RequestInterface;

/**
 * Manage large file downloads, which may be media but can be any type
 * of sizable data.
 */
class Google_Http_MediaFileDownload
{
  const DOWNLOAD_MEDIA_TYPE = 'media';
  const DOWNLOAD_MULTIPART_TYPE = 'multipart';
  const DOWNLOAD_RESUMABLE_TYPE = 'resumable';

  /** @var string $mimeType */
  private $mimeType;

  /** @var string $data */
  private $data;

  /** @var bool $resumable */
  private $resumable;

  /** @var int $chunkSize */
  private $chunkSize;

  /** @var int $size */
  private $size;

  /** @var string $resumeUri */
  private $resumeUri;

  /** @var int $progress */
  private $progress;

  /** @var Google_Client */
  private $client;

  private $logger;

  /** @var Psr\Http\Message\RequestInterface */
  private $request;

  /** @var string */
  private $boundary;

  /**
   * Result code from last HTTP call
   * @var int
   */
  private $httpResultCode;

  /**
   * @param $mimeType string
   * @param $data string The bytes you want to download.
   * @param $resumable bool
   * @param int|false $chunkSize File will be downloaded in chunks of this many bytes.
   * only used if resumable=True
   */
  public function __construct(
      Google_Client $client,
      RequestInterface $request,
      $mimeType,
      $data,
      $logger = null,
      $resumable = false,
      $chunkSize = false
  ) {
    $this->logger = $logger;
    $this->client = $client;
    $this->request = $request;
    $this->mimeType = $mimeType;
    $this->data = $data;
    $this->resumable = $resumable;
    $this->chunkSize = $chunkSize;
    $this->progress = 0;

    //$this->process();
  }

  /**
   * Set the size of the file that is being downloaded.
   * @param $size - int file size in bytes
   */
  public function setFileSize($size)
  {
    $this->size = $size;
  }

  /**
   * Return the progress on the download
   * @return int progress in bytes downloaded.
   */
  public function getProgress()
  {
    return $this->progress;
  }

  /**
   * Request the next part of the file.
   * @param [$chunk] unused for downloads; the byte range is computed from
   * the current progress, the chunk size and the file size.
   */
  public function nextChunk($chunk = false)
  {
    //$resumeUri = $this->getResumeUri();

//     if (false == $chunk) {
//       $chunk = substr($this->data, $this->progress, $this->chunkSize);
//     }

//     $lastBytePos = $this->progress + strlen($chunk) - 1;

//     $headers = array(
//       'content-range' => "bytes $this->progress-$lastBytePos/$this->size",
//       'content-length' => strlen($chunk),
//       'expect' => '',
//     );

//     $request = new Request(
//         'PUT',
//         $resumeUri,
//         $headers,
//         Psr7\stream_for($chunk)
//     );

        //TODO FV calculate start end range from response if present ????

        $resumeUri = $this->getResumeUri();
        $lastBytePos = $this->progress + $this->chunkSize - 1;

        $lastBytePosSize = $this->size - 1;

        $lastBytePos = min($lastBytePos,$lastBytePosSize);

        $headers = array('Range' => "bytes=$this->progress-$lastBytePos");

        $this->logger->info("'Range' [" . "bytes=$this->progress-$lastBytePos" . "]");

        $this->logger->info("resumeUri [" . $resumeUri . "]");

        $request = new Request(
            'GET',
            $resumeUri,
            $headers
        );


    return $this->makeGetRequest($request);
  }

  /**
   * Return the HTTP result code from the last call made.
   * @return int code
   */
  public function getHttpResultCode()
  {
    return $this->httpResultCode;
  }

  /**
  * Sends a PUT request to Google Drive and parses the response,
  * setting the appropriate variables from the response.
  *
  * @param RequestInterface $request the request which will be sent
  *
  * @return false|mixed false when the download is unfinished or the decoded http response
  *
  */
  private function makePutRequest(RequestInterface $request)
  {
    $response = $this->client->execute($request);
    $this->httpResultCode = $response->getStatusCode();

    if (308 == $this->httpResultCode) {
      // Track the amount downloaded.
      $range = explode('-', $response->getHeaderLine('range'));
      $this->progress = $range[1] + 1;

      // Allow for changing download URLs.
      $location = $response->getHeaderLine('location');
      if ($location) {
        $this->resumeUri = $location;
      }

      // No problems, but download not complete.
      return false;
    }

    return Google_Http_REST::decodeHttpResponse($response, $this->request);
  }

  private function makeGetRequest(RequestInterface $request)
  {
    $response = $this->client->execute($request);
    $this->httpResultCode = $response->getStatusCode();

    $this->logger->info("httpResultCode [" . $this->httpResultCode . "]");

    //if (300 == $this->httpResultCode || 200 == $this->httpResultCode) {

    if ($this->httpResultCode >= 200 && $this->httpResultCode < 300) {

        // Track the amount downloaded.
//          $range = explode('-', $response->getHeaderLine('range'));
//          $this->progress = $range[1] + 1;


        $range = explode('-', $response->getHeaderLine('content-range'));
        $this->logger->info("range[0] [" . $range[0] . "]");
        $this->logger->info("range[1] [" . $range[1] . "]");

        $range = explode('/', $range[1]);
        $this->logger->info("range[0] [" . $range[0] . "]");

        $this->progress = $range[0] + 1;
        $this->logger->info("progress [" . $this->progress . "]");

        // Allow for changing download URLs.
        $location = $response->getHeaderLine('location');
        if ($location) {
            $this->resumeUri = $location;
            $this->logger->info("resumeUri from location [" . $this->resumeUri . "]");
        }

        // No problems, but download not complete.
        //return false;
        return $response;
    }
    else if($this->httpResultCode >= 400)
    {
        return Google_Http_REST::decodeHttpResponse($response, $this->request);

    }

    return false;

    //return Google_Http_REST::decodeHttpResponse($response, $this->request);

  }


  /**
   * Resume a previously unfinished download
   * @param $resumeUri the resume-URI of the unfinished, resumable download.
   */
  public function resume($resumeUri)
  {
     $this->resumeUri = $resumeUri;
     $headers = array(
       'content-range' => "bytes */$this->size",
       'content-length' => 0,
     );
     $httpRequest = new Request(
         'PUT',
         $this->resumeUri,
         $headers
     );

     return $this->makePutRequest($httpRequest);
  }

  /**
   * @return Psr\Http\Message\RequestInterface $request
   * @visible for testing
   */
  private function process()
  {
    $this->transformToDownloadUrl();
    $request = $this->request;

    $postBody = '';
    $contentType = false;

    $meta = (string) $request->getBody();
    $meta = is_string($meta) ? json_decode($meta, true) : $meta;

    $downloadType = $this->getDownloadType($meta);
    $request = $request->withUri(
        Uri::withQueryValue($request->getUri(), 'downloadType', $downloadType)
    );

    $mimeType = $this->mimeType ?
        $this->mimeType :
        $request->getHeaderLine('content-type');

    if (self::DOWNLOAD_RESUMABLE_TYPE == $downloadType) {
      $contentType = $mimeType;
      $postBody = is_string($meta) ? $meta : json_encode($meta);
    } else if (self::DOWNLOAD_MEDIA_TYPE == $downloadType) {
      $contentType = $mimeType;
      $postBody = $this->data;
    } else if (self::DOWNLOAD_MULTIPART_TYPE == $downloadType) {
      // This is a multipart/related download.
      $boundary = $this->boundary ? $this->boundary : mt_rand();
      $boundary = str_replace('"', '', $boundary);
      $contentType = 'multipart/related; boundary=' . $boundary;
      $related = "--$boundary\r\n";
      $related .= "Content-Type: application/json; charset=UTF-8\r\n";
      $related .= "\r\n" . json_encode($meta) . "\r\n";
      $related .= "--$boundary\r\n";
      $related .= "Content-Type: $mimeType\r\n";
      $related .= "Content-Transfer-Encoding: base64\r\n";
      $related .= "\r\n" . base64_encode($this->data) . "\r\n";
      $related .= "--$boundary--";
      $postBody = $related;
    }

    $request = $request->withBody(Psr7\stream_for($postBody));

    if (isset($contentType) && $contentType) {
      $request = $request->withHeader('content-type', $contentType);
    }

    return $this->request = $request;
  }

  /**
   * Valid download types:
   * - resumable (DOWNLOAD_RESUMABLE_TYPE)
   * - media (DOWNLOAD_MEDIA_TYPE)
   * - multipart (DOWNLOAD_MULTIPART_TYPE)
   * @param $meta
   * @return string
   * @visible for testing
   */
  public function getDownloadType($meta)
  {
    if ($this->resumable) {
      return self::DOWNLOAD_RESUMABLE_TYPE;
    }

    if (false == $meta && $this->data) {
      return self::DOWNLOAD_MEDIA_TYPE;
    }

    return self::DOWNLOAD_MULTIPART_TYPE;
  }

  public function getResumeUri()
  {
    if (is_null($this->resumeUri)) {
      //$this->resumeUri = $this->fetchResumeUri();
        $this->resumeUri = $this->request->getUri();
    }

    return $this->resumeUri;
  }

  private function fetchResumeUri()
  {
    $result = null;
    $body = $this->request->getBody();
    if ($body) {
      $headers = array(
        'content-type' => 'application/json; charset=UTF-8',
        'content-length' => $body->getSize(),
        'x-download-content-type' => $this->mimeType,
        'x-download-content-length' => $this->size,
        'expect' => '',
      );
      foreach ($headers as $key => $value) {
        $this->request = $this->request->withHeader($key, $value);
      }
    }

    $response = $this->client->execute($this->request, false);
    $location = $response->getHeaderLine('location');
    $code = $response->getStatusCode();

    if (200 == $code && true == $location) {
      return $location;
    }

    $message = $code;
    $body = json_decode((string) $this->request->getBody(), true);
    if (isset($body['error']['errors'])) {
      $message .= ': ';
      foreach ($body['error']['errors'] as $error) {
        $message .= "{$error['domain']}, {$error['message']};";
      }
      $message = rtrim($message, ';');
    }

    $error = "Failed to start the resumable download (HTTP {$message})";
    $this->client->getLogger()->error($error);

    throw new Google_Exception($error);
  }

  private function transformToDownloadUrl()
  {
    $parts = parse_url((string) $this->request->getUri());
    if (!isset($parts['path'])) {
      $parts['path'] = '';
    }
    $parts['path'] = '/download' . $parts['path'];
    $uri = Uri::fromParts($parts);
    $this->request = $this->request->withUri($uri);
  }

  public function setChunkSize($chunkSize)
  {
    $this->chunkSize = $chunkSize;
  }

  public function getRequest()
  {
    return $this->request;
  }
}
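
The class above recovers the progress from the Content-Range response header with two explode() calls. A slightly more defensive way to do that parsing might look like the sketch below. Note that parseContentRange is a hypothetical helper name, not part of the Google client, and it assumes the server answers with the standard "bytes first-last/total" form.

```php
<?php
/**
 * Parse a Content-Range header such as "bytes 0-1048575/20971520".
 * Hypothetical helper: returns [first, last, total], where total is null
 * when the server sends "*", or null if the header does not match at all
 * (for example when the Range header was ignored and no Content-Range came back).
 */
function parseContentRange(string $header): ?array
{
    if (!preg_match('/^bytes (\d+)-(\d+)\/(\d+|\*)$/', trim($header), $m)) {
        return null;
    }
    $total = $m[3] === '*' ? null : (int) $m[3];
    return [(int) $m[1], (int) $m[2], $total];
}

[$first, $last, $total] = parseContentRange('bytes 0-1048575/20971520');
$nextStart = $last + 1;                           // offset to request next
$done = $total !== null && $nextStart >= $total;  // true once the last byte arrived
```

Returning null instead of reading undefined array offsets makes it easier to stop the loop cleanly when a response carries no usable range.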

To use it:

                                    header('Content-Type: application/octet-stream');
                                    //header('Content-Type: ' . $googledrivefile->getMimeType());
                                    //header('Content-Disposition: attachment; filename='.basename($this->real_file));
                                    header('Content-Disposition: attachment; filename='.$googledrivefile->getTitle());
                                    header('Expires: 0');
                                    header('Pragma: public');
                                    header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
                                    //header('Content-Length: ' . get_real_size($this->real_file));
                                    header("Content-Description: File Transfer");
                                    header('Content-Length: ' . $googledrivefile->getFileSize());
                                    //readfile($this->real_file);
                                    //readfile(UPLOADED_FILES_FOLDER . $googledrivefile->getTitle());

                                    $chunkSizeBytes = ((int)GD_CHUNKSIZE_DOWNLOAD_MB) * 1024 * 1024;

                                    if ($googledrivefile->getFileSize() <= $chunkSizeBytes)
                                    {

                                        header('Cache-Control: private',false);
                                        header('Content-Transfer-Encoding: binary');
                                        header('Connection: close');

                                        //Call without MediaFileDownload
                                        echo $this->getgoogledriverequestbodyFile($this->googledriveservice, $googledrivefile,$this->logger);

                                    //FV Begin Commented because the file downloaded is corrupted
                                    }
                                    else
                                    {

                                        header('Cache-Control: private',false);
                                        //header("Cache-Control: public");
                                        header('Content-Transfer-Encoding: chunked');
                                        header('Connection: keep-alive');


                                        try {
                                            //Begin Use of MediaFileDownload
                                            $this->logger->info("Begin Use of MediaFileDownload...");
                                            // Call the API for the media download, deferred so it doesn't execute immediately.
                                            $this->googledriveclient->setDefer(true);

                                            $request = $this->googledriveservice->files->get($googledrivefile->getId(), array(
                                                    'alt' => 'media' ));

//                                              $downloadUrl = $googledrivefile->getDownloadUrl(); //for google doc is empty
//                                              //$downloadUrl = "https://www.googleapis.com/drive/v3/files/" . $googledrivefile->getId() . "?alt=media";

//                                              $request = new Request(
//                                                      'GET',
//                                                      $downloadUrl
//                                              );

                                            $this->logger->info("mediadownloadfromgoogledriveincurrentfolder request->getUri() [" . $request->getUri() . "]");

                                            // Create a media file download to represent our download process.
                                            $media = new Google_Http_MediaFileDownload(
                                                    $this->googledriveclient,
                                                    $request,
                                                    $googledrivefile->getMimeType(),
                                                    null,
                                                    $this->logger,
                                                    true,
                                                    $chunkSizeBytes
                                            );

                                            $media->setFileSize($googledrivefile->getFileSize());

                                            $status = true;
                                            $progress = 0;
                                            $previousprogress = 0;

                                            while ($status) {
                                                $this->logger->info("mediadownloadfromgoogledriveincurrentfolder read next chunk ");
                                                $status = $media->nextChunk();

                                                if(!$status)
                                                {
                                                    $this->logger->info("mediadownloadfromgoogledriveincurrentfolder an error occurred ");
                                                    break;
                                                }

                                                $response = $status;

                                                $range = explode('-', $response->getHeaderLine('content-range'));
                                                $this->logger->info("mediadownloadfromgoogledriveincurrentfolder range[1] [" . $range[1] . "]");

                                                $range = explode('/', $range[1]);
                                                $this->logger->info("mediadownloadfromgoogledriveincurrentfolder range[0] [" . $range[0] . "]");

                                                $progress = $range[0];
                                                $mediaSize = $range[1];
                                                $this->logger->info("mediadownloadfromgoogledriveincurrentfolder progress [" . $progress . "]");
                                                $this->logger->info("mediadownloadfromgoogledriveincurrentfolder mediaSize [" . $mediaSize . "]");


                                                if($progress > $previousprogress)
                                                {
                                                    //Flush the content
                                                    //$contentbody = $response->getBody()->__toString();
                                                    $contentbody = $response->getBody();

                                                    //$this->logger->info("content " . $contentbody);

                                                    //Clean buffer and end buffering
                                                    while (ob_get_level()) ob_end_clean();

                                                    //Start buffering
                                                    //ob_implicit_flush();
                                                    if (!ob_get_level()) ob_start();

                                                    echo $contentbody;
                                                    ob_flush();
                                                    flush();

                                                    $previousprogress = $progress;

                                                    //sleep(1);
                                                    //usleep(1000000);
                                                    usleep(5000);



                                                }                                               


                                                if(($mediaSize - 1) <= $progress)
                                                {

                                                    ob_end_flush();

                                                    //Clean buffer and end buffering
                                                    while (ob_get_level()) ob_end_clean();

                                                    $this->logger->info("mediadownloadfromgoogledriveincurrentfolder (mediaSize - 1) <= progress END OF FILE");
                                                    break;
                                                }
                                            }


                                        } catch (Google_Service_Exception $e) {

                                            $this->logger->error("mediadownloadfromgoogledriveincurrentfolder error Google_Service_Exception" . $e->getMessage(),$e);
                                            $this->logger->error("mediadownloadfromgoogledriveincurrentfolder error Google_Service_Exception errors " . var_export($e->getErrors(), true));

                                        } catch (Exception $e) {

                                            $this->logger->error("mediadownloadfromgoogledriveincurrentfolder error " . $e->getMessage(),$e);

                                        }
                                        finally {

                                            $this->googledriveclient->setDefer(false);
                                            $this->logger->info("End Use of MediaFileDownload...");
                                            //End Use of MediaFileDownload

                                        }

                                    }

Upvotes: 0

yasirfarooqui

Reputation: 193

Downloading a large file all at once is not a good approach. For example, if you have a 1 GB file to download, you are effectively creating a PHP variable of that size, which can also trigger an 'allowed memory size exceeded' error.

A better idea is to download the file in chunks. If you are using the official Google SDK for PHP, you can do something similar to the following:

$tmpFileName = tempnam(sys_get_temp_dir(), 'gdrive');
$fp = fopen($tmpFileName, "wb");

$downloadUrl = $file->getDownloadUrl();
$fileSize = (int) $file->getFileSize(); // total size of the file in bytes
$chunkSize = 1024 * 1024;               // bytes per request; adjust as needed
$start = 0;

// loop over the file, calculating $start / $end with respect to its total size
while ($start < $fileSize) {
  $end = min($start + $chunkSize, $fileSize) - 1; // byte ranges are inclusive
  $request = new Google_Http_Request($downloadUrl, 'GET', null, null);
  $request->setRequestHeaders(
    array('Range' => 'bytes=' . $start . '-' . $end)
  );
  $httpRequest = $service->getClient()->getAuth()->authenticatedRequest($request);
  fwrite($fp, $httpRequest->getResponseBody());
  $httpRequest = NULL; // release the chunk after writing it to the file
  $start = $end + 1;
}
fclose($fp);
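
If the response is available as a PHP stream resource rather than a string, the same "one piece at a time" idea can be pushed further, so that not even a full chunk sits in a variable. The following is only a sketch using plain PHP streams: copyInChunks is a made-up helper name, and how you obtain the source stream from your HTTP client is left as an assumption.

```php
<?php
/**
 * Copy $in to $out in reads of at most $bufferSize bytes.
 * Hypothetical helper; both arguments are ordinary PHP stream resources.
 * Returns the number of bytes copied.
 */
function copyInChunks($in, $out, int $bufferSize = 8192): int
{
    $written = 0;
    while (!feof($in)) {
        $buffer = fread($in, $bufferSize);
        if ($buffer === false) {
            throw new RuntimeException('Reading from the source stream failed');
        }
        // fwrite may write fewer bytes than asked, so drain the buffer in a loop
        for ($offset = 0, $length = strlen($buffer); $offset < $length; $offset += $n) {
            $n = fwrite($out, substr($buffer, $offset));
            if ($n === false || $n === 0) {
                throw new RuntimeException('Writing to the destination stream failed');
            }
        }
        $written += strlen($buffer);
    }
    return $written;
}

// Demo with in-memory streams standing in for the download and the target file
$in = fopen('php://memory', 'w+b');
fwrite($in, str_repeat('x', 20000));
rewind($in);
$out = fopen('php://memory', 'w+b');
$copied = copyInChunks($in, $out, 4096);
```

Passing fopen('php://output', 'wb') as $out would send the data straight to the client instead of a file.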

Upvotes: 1

Linda Lawton - DaImTo

Reputation: 116878

Yes, it is possible to specify how much of the file to get, and so to fetch it one chunk at a time. But I'm not sure whether you can actually read what is in the file until it is fully downloaded.
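
Whether a partial file is readable depends on its format, but some kinds of processing work fine on chunks as they arrive. As a rough illustration (hashChunks is a hypothetical helper, and the chunks here are simulated with str_split rather than fetched from Drive), a checksum can be computed incrementally with PHP's hash_init / hash_update / hash_final:

```php
<?php
/**
 * Feed chunks to a hash context one at a time and return the hex digest.
 * $chunks is any iterable of strings, e.g. the bodies of successive
 * Range responses (simulated below).
 */
function hashChunks(iterable $chunks, string $algo = 'sha256'): string
{
    $ctx = hash_init($algo);
    foreach ($chunks as $chunk) {
        hash_update($ctx, $chunk); // only the current chunk has to be in memory
    }
    return hash_final($ctx);
}

$whole = str_repeat('abc', 1000);
$digest = hashChunks(str_split($whole, 256));

// The incremental digest matches the digest of the complete data
var_dump($digest === hash('sha256', $whole));
```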

Google drive SDK download files - Partial download

Partial download

Partial download involves downloading only a specified portion of a file. You 
can specify the portion of the file you want to download by using a byte range 
with the Range header. For example:
Range: bytes=500-999
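
To walk a whole file with such headers you need one range per chunk. Here is a minimal sketch of that arithmetic; byteRanges is a name I made up, and it assumes you already know the file size in bytes. Both ends of a byte range are inclusive, so bytes=500-999 covers 500 bytes.

```php
<?php
/**
 * Yield inclusive Range header values that cover a file of $size bytes
 * in pieces of at most $chunkSize bytes. Hypothetical helper.
 */
function byteRanges(int $size, int $chunkSize): Generator
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be at least 1');
    }
    for ($start = 0; $start < $size; $start += $chunkSize) {
        // The last range is clipped to the final byte, which is $size - 1
        $end = min($start + $chunkSize, $size) - 1;
        yield sprintf('bytes=%d-%d', $start, $end);
    }
}

// Prints bytes=0-999, bytes=1000-1999 and bytes=2000-2499, one per line
foreach (byteRanges(2500, 1000) as $range) {
    echo $range, PHP_EOL;
}
```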

I did a quick scan of the PHP client library and I'm not sure that it supports this. It may be something that needs to be added to the client library, or something that you will have to code on your own without using the client library.

Upvotes: 1
