Reputation: 1848
Chunked downloading of files using the Google Drive API (v3) can be done using the MediaIoBaseDownload class in conjunction with a request object created by request = service.files().get_media(fileId=<id>).
Partial downloading can be done by setting the Range field of the HTTP request header, as explained in this post:
request.headers["Range"] = "bytes={}-{}".format(start, start+length)
However, the two cannot be combined, as the byte range information in the header is ignored by MediaIoBaseDownload.
How can a partial download be accomplished in a chunked manner?
Upvotes: 2
Views: 677
Reputation: 1848
This is a partial answer, which addresses the start byte of a range, but not the end byte.
As Tanaike pointed out in the comments, MediaIoBaseDownload ignores a user-supplied HTTP Range header. A range specified as follows:
request.headers["Range"] = "bytes={}-{}".format(start, start+length)
actually gets added to self._headers in the MediaIoBaseDownload constructor, but is promptly overwritten on the first call to the next_chunk method, where headers['range'] is set to 'bytes=%d-%d' % (self._progress, self._progress + self._chunksize). On the first call, self._progress=0 (set by the constructor), so the method will always start the download from the first (zeroth) byte of the file.
There are a few simple ways to change this. We could check whether request.headers['Range'] exists and parse the specified byte positions. Alternatively, we could expose the behavior directly to the caller by adding keyword arguments to the constructor for passing the starting and ending byte positions.
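The first option (parsing the byte positions out of a user-supplied Range value) could look roughly like the sketch below. The helper name parse_byte_range is hypothetical and not part of googleapiclient. It only handles the single-range forms bytes=START-END and bytes=START-, which is all the question uses.

```python
import re

# Matches "bytes=START-END" or "bytes=START-" (open-ended); nothing else.
_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d*)$")


def parse_byte_range(value):
    """Parse a single-range HTTP Range value into (start, end).

    `end` is None for an open-ended range. Suffix ranges ("bytes=-N") and
    multi-range values are deliberately rejected in this sketch.
    Hypothetical helper; not part of googleapiclient.
    """
    match = _RANGE_RE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Range value: %r" % (value,))
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else None
    if end is not None and end < start:
        raise ValueError("range end precedes start: %r" % (value,))
    return start, end
```

A patched constructor could call such a helper on request.headers.get("Range") and seed its progress counter from the returned start, falling back to 0 when no Range header is present.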
The following patch (against version 1.7.11 of googleapiclient) takes the approach of adding a start keyword argument to the MediaIoBaseDownload constructor, so that a download can begin from the Nth byte. If a starting byte is not specified, the download defaults to starting from the beginning of the file. Since support for an ending byte position has not been implemented, the download will continue until EOF.
--- googleapiclient/http.py.orig 2019-08-05 12:24:31.000000000 -0700
+++ googleapiclient/http.py 2020-01-19 18:31:56.785404831 -0800
@@ -632,7 +632,7 @@
   """
 
   @util.positional(3)
-  def __init__(self, fd, request, chunksize=DEFAULT_CHUNK_SIZE):
+  def __init__(self, fd, request, chunksize=DEFAULT_CHUNK_SIZE, start=0):
     """Constructor.
 
     Args:
@@ -646,7 +646,7 @@
     self._request = request
     self._uri = request.uri
     self._chunksize = chunksize
-    self._progress = 0
+    self._progress = start
     self._total_size = None
     self._done = False
 
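The patch leaves the ending byte unhandled. One way to reason about the missing half is to plan the per-chunk ranges up front and stop at the requested end byte. The sketch below is library-independent and is not googleapiclient code: chunk_ranges and download_range are hypothetical names, and fetch stands in for whatever performs one HTTP GET with the given headers. HTTP byte ranges are inclusive at both ends, so each chunk here ends at first + chunksize - 1.

```python
import io


def chunk_ranges(start, end, chunksize):
    """Yield (first, last) inclusive byte positions covering start..end."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    first = start
    while first <= end:
        # Clamp the final chunk so it never runs past the requested end byte.
        last = min(first + chunksize - 1, end)
        yield first, last
        first = last + 1


def download_range(fetch, fd, start, end, chunksize):
    """Write bytes start..end (inclusive) to fd, one ranged request per chunk.

    `fetch(headers) -> bytes` is a hypothetical callable, assumed to issue a
    single HTTP GET with the given headers and return the response body.
    """
    for first, last in chunk_ranges(start, end, chunksize):
        fd.write(fetch({"Range": "bytes=%d-%d" % (first, last)}))


# In-memory stand-in for the remote file, so the sketch runs without a network.
blob = bytes(range(100))


def fake_fetch(headers):
    first, last = headers["Range"][len("bytes="):].split("-")
    return blob[int(first):int(last) + 1]


out = io.BytesIO()
download_range(fake_fetch, out, start=10, end=29, chunksize=8)
assert out.getvalue() == blob[10:30]
```

The same clamping could be folded into next_chunk if an end keyword argument were added alongside start, but that change is not part of the patch above.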
Upvotes: 1