user001

Reputation: 1848

Chunked partial download using the Google Drive Python API

Chunked downloading of files with the Google Drive API (v3) can be done using the MediaIoBaseDownload class together with a request object created by request = service.files().get_media(fileId=<id>).

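Partial downloading can be done by setting the Range header of the HTTP request, as explained in this post. Both byte positions in a Range header are inclusive, so a request for length bytes ends at start + length - 1:

request.headers["Range"] = "bytes={}-{}".format(start, start + length - 1)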

However, the two cannot be combined, as the byte range information in the header is ignored by MediaIoBaseDownload.

How can a partial download be accomplished in a chunked manner?

Upvotes: 2

Views: 677

Answers (1)

user001

Reputation: 1848

This is a partial answer: it addresses the start byte of a range, but not the end byte.

As Tanaike pointed out in the comments, MediaIoBaseDownload ignores a user-supplied HTTP Range header. A range specified as follows (byte positions are inclusive, hence the - 1):

request.headers["Range"] = "bytes={}-{}".format(start, start + length - 1)

actually gets added to self._headers in the MediaIoBaseDownload constructor, but it is overwritten on the first call to the next_chunk method, which sets headers['range'] to 'bytes=%d-%d' % (self._progress, self._progress + self._chunksize). Because the constructor initializes self._progress to 0, the first call always requests a range beginning at byte 0, so the download always starts from the beginning of the file.

There are a few simple ways to change this. We could check whether request.headers['Range'] exists and parse the byte positions it specifies. Alternatively, we could expose the behavior directly to the caller by adding keyword arguments to the constructor for the starting and ending byte positions.
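
The first option could look roughly like the sketch below. It is only an illustration: parse_byte_range is a hypothetical helper, not part of googleapiclient, and it assumes the single-range form "bytes=START-[END]" with an inclusive END.

```python
import re

# Hypothetical helper for illustration; not part of googleapiclient.
_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d*)$")


def parse_byte_range(headers):
    """Return (start, end) from a Range header, or (0, None) if absent.

    `end` is inclusive, as in HTTP; None means "until EOF".
    Only the single-range form "bytes=START-[END]" is handled.
    """
    # Header names are case-insensitive, so compare them in lower case.
    value = next((v for k, v in headers.items() if k.lower() == "range"), None)
    if value is None:
        return 0, None
    match = _RANGE_RE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Range header: %r" % value)
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else None
    if end is not None and end < start:
        raise ValueError("range end precedes start: %r" % value)
    return start, end
```

The returned start could then seed the download position, and end (when not None) could bound it.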

The following patch (against version 1.7.11 of googleapiclient) takes the second approach and adds a start keyword argument to the MediaIoBaseDownload constructor, so that a download can begin from the Nth byte. If no starting byte is specified, the download begins at the start of the file, as before. Since support for an ending byte position has not been implemented, the download continues until EOF.

--- googleapiclient/http.py.orig    2019-08-05 12:24:31.000000000 -0700
+++ googleapiclient/http.py 2020-01-19 18:31:56.785404831 -0800
@@ -632,7 +632,7 @@
   """

   @util.positional(3)
-  def __init__(self, fd, request, chunksize=DEFAULT_CHUNK_SIZE):
+  def __init__(self, fd, request, chunksize=DEFAULT_CHUNK_SIZE, start=0):
     """Constructor.

     Args:
@@ -646,7 +646,7 @@
     self._request = request
     self._uri = request.uri
     self._chunksize = chunksize
-    self._progress = 0
+    self._progress = start
     self._total_size = None
     self._done = False
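
With the patch applied, a download starting at byte N would be created by passing start as a keyword argument, e.g. MediaIoBaseDownload(fd, request, start=N).

The missing end-byte support comes down to capping each chunk's range at the requested last byte. The sketch below shows only that arithmetic; iter_chunk_ranges and range_header are hypothetical names, not googleapiclient functions, and end is assumed to be inclusive.

```python
def iter_chunk_ranges(start, end, chunksize):
    """Yield inclusive (first, last) byte positions covering start..end.

    Hypothetical helper: `end` is the last byte wanted (inclusive), and
    every yielded pair spans at most `chunksize` bytes.
    """
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    first = start
    while first <= end:
        # Never request past the caller's end byte.
        last = min(first + chunksize - 1, end)
        yield first, last
        first = last + 1


def range_header(first, last):
    """Format an inclusive byte range as an HTTP Range header value."""
    return "bytes={}-{}".format(first, last)
```

Each yielded pair corresponds to one chunk request, and the download would be considered finished once the pair ending at end has been fetched.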

Upvotes: 1
