Reputation: 1707
I want to stream a large file directly into a gzip file, instead of downloading it all into memory and then compressing it. This is how far I have gotten (it does not work). I know how to download a file in Python and save it, and I know how to compress one; it is the streaming part that does not work.
Note: the linked CSV is not large; it is just an example URL.
import requests
import zlib

url = f"http://samplecsvs.s3.amazonaws.com/Sacramentorealestatetransactions.csv"
with requests.get(url, stream=True) as r:
    compressor = zlib.compressobj()
    with open(save_file_path, 'wb') as f:
        f.write(compressor.compress(r.raw))
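A likely reason the attempt above fails is that `compress()` expects a bytes-like object, while `r.raw` is a file-like stream. The sketch below reproduces this without any network access; `fake_raw` is a hypothetical stand-in for `r.raw`, and the data is a made-up placeholder.

```python
import io
import zlib

compressor = zlib.compressobj()
fake_raw = io.BytesIO(b"some,csv,data\n")  # hypothetical stand-in for r.raw

try:
    # Passing the stream object itself, as in the attempt above.
    compressor.compress(fake_raw)
    failed = False
except TypeError:
    failed = True

# Reading bytes out of the stream first is accepted by compress().
fake_raw.seek(0)
data = compressor.compress(fake_raw.read(1024)) + compressor.flush()
```

Note that the compressed output is only complete once `flush()` has been called, which the attempt above also omits.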
Upvotes: 4
Views: 1349
Reputation: 1707
Alright I figured it out:
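import shutil
import zlib

import requests

# url and save_file_path are defined as in the question.
# Note: verify=False disables TLS certificate verification.
with requests.get(url, stream=True, verify=False) as r:
    if save_file_path.endswith('gz'):
        # MAX_WBITS | 16 makes zlib write a gzip header and trailer.
        compressor = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
        with open(save_file_path, 'wb') as f:
            # Compress and write one 1 MiB chunk at a time.
            for chunk in r.iter_content(chunk_size=1024*1024):
                f.write(compressor.compress(chunk))
            # Write whatever is still buffered, plus the gzip trailer.
            f.write(compressor.flush())
    else:
        with open(save_file_path, 'wb') as f:
            shutil.copyfileobj(r.raw, f)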
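The same chunk-by-chunk idea can be tried without a network connection. This is a minimal sketch, not the only way to do it: `stream_to_gzip` is a hypothetical helper name, `fake_chunks` is placeholder data standing in for `r.iter_content(chunk_size=...)`, and the output goes to a temporary directory.

```python
import gzip
import os
import tempfile
import zlib
from typing import Iterable


def stream_to_gzip(chunks: Iterable[bytes], dest_path: str, level: int = 9) -> int:
    """Compress an iterable of byte chunks into a gzip file, one chunk at a time.

    Returns the number of uncompressed bytes consumed.
    """
    # wbits = MAX_WBITS | 16 asks zlib for gzip framing (header and trailer).
    compressor = zlib.compressobj(level, zlib.DEFLATED, zlib.MAX_WBITS | 16)
    total = 0
    with open(dest_path, "wb") as f:
        for chunk in chunks:
            total += len(chunk)
            f.write(compressor.compress(chunk))
        # flush() emits any buffered data and the gzip trailer.
        f.write(compressor.flush())
    return total


# Placeholder data standing in for r.iter_content(chunk_size=...).
fake_chunks = [b"col_a,col_b\n", b"1,2\n", b"3,4\n"]
out_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
written = stream_to_gzip(fake_chunks, out_path)

# Round-trip check: read the file back with the standard gzip module.
with gzip.open(out_path, "rb") as f:
    restored = f.read()
```

With a real download, the call would presumably be `stream_to_gzip(r.iter_content(chunk_size=1024*1024), save_file_path)` inside the `with requests.get(...)` block, so only one chunk is held in memory at a time.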
Upvotes: 1