Python http download using requests.get always missing a chunk

Question

I am trying to write a function that resumes a download if the connection is broken. However, the following does not work as expected. In line 8, I have to manually subtract one chunk size from the position for it to work; otherwise, the final file is missing exactly one chunk for each time I resume the download.

if os.path.exists(fileName):
    header = requests.head(url)
    fileLength = int(header.headers['Content-Length'])
    if fileLength == os.path.getsize(fileName):
        return True
    else:
        with open(fileName, 'ab') as f:
            position = f.tell()-1024
            pos_header = {}
            print(position)
            pos_header['Range'] = f'bytes={position}-'

        with requests.get(url, headers = pos_header, stream = True) as r:
            with open(fileName, 'ab') as f:
                # some validation should be here
                for chunk in r.iter_content(chunk_size=1024):
                    if chunk:
                        f.write(r.content)
                        f.flush()
                        print(os.path.getsize(fileName))

else:
    with requests.get(url, allow_redirects=True, stream = True) as r:
        with open(fileName, 'wb') as f:
            iter = 0
            for chunk in r.iter_content(chunk_size = 1024):
                if chunk:
                    f.write(chunk)
                    f.flush()
                    iter += 1
                if iter > 2000:
                    break

Interestingly, the missing part lies between the two parts of the download, at the point where the resumed download begins. Is there a more elegant way of resolving this than what I did?
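
A likely cause, judging only from the code above: the resume loop calls f.write(r.content) instead of f.write(chunk). With stream=True, the iterator has already pulled the first chunk off the connection by the time the loop body runs, so r.content can only return what is left of the body. That would lose exactly one chunk per resume, at the boundary between the old and new data. Below is a minimal sketch of a resumable download that writes each chunk directly and needs no manual offset. The function name resume_download and its session parameter are hypothetical (the session is only there so that a stand-in can be passed in for testing), and treating HTTP 416 as "file already complete" is an assumption, not something the question states.

```python
import os


def resume_download(url, file_name, chunk_size=1024, session=None):
    """Download url into file_name, resuming from the bytes already on disk.

    Hypothetical helper: the name and the session parameter are illustrative.
    """
    if session is None:
        import requests  # imported lazily so a stand-in session can be injected
        session = requests.Session()

    # Resume from the current file size; no chunk-size correction is needed.
    position = os.path.getsize(file_name) if os.path.exists(file_name) else 0
    headers = {'Range': f'bytes={position}-'} if position else {}

    with session.get(url, headers=headers, stream=True) as r:
        if r.status_code == 416:
            # Range starts at or past the end of the resource. Assumption:
            # this means the local file is already complete.
            return True
        r.raise_for_status()
        # 206 means the server honoured the Range header. Any other success
        # status (typically 200) carries the whole body, so start over.
        mode = 'ab' if position and r.status_code == 206 else 'wb'
        with open(file_name, mode) as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                if chunk:
                    f.write(chunk)  # write the chunk itself, not r.content
    return True
```

Calling resume_download(url, fileName) again after an interrupted run should then continue from the existing file size. Checking for status 206 is one way to fill in the "some validation should be here" step, since a server that ignores Range replies with 200 and the full body.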
