Cliff

Reputation: 79

Uploading large zip files to a website using Python

I have the following problem: I need to upload large .zip files (usually >500 MB, up to about 5 GB) to a website, which then processes them. I do this in Python 2.7.16 on 32-bit Windows. Unfortunately, I cannot change my setup (from 32-bit to 64-bit), nor can I install additional Python packages (I have requests, urllib, urllib2 and several others installed) due to company restrictions. My code currently looks like this:

import requests

FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    print("Upload file: "+str(FilePath))
    session = requests.Session()
    with open(FilePath, "rb") as file:
        session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
    print("Upload done: "+str(FilePath))
    session.close()

Since my FileList is quite long (>100 entries), I have pasted only an excerpt of it here. The code above works well for files below 600 MB. Any larger file throws this error:

  File "<stdin>", line 1, in <module>
  File "C:\Users\AAA253\Desktop\DingDong.py", line 39, in <module>
    session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 522, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 461, in request
    prep = self.prepare_request(req)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 394, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python27\lib\site-packages\requests\models.py", line 297, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python27\lib\site-packages\requests\models.py", line 455, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python27\lib\site-packages\requests\models.py", line 158, in _encode_files
    body, content_type = encode_multipart_formdata(new_fields)
  File "C:\Python27\lib\site-packages\requests\packages\urllib3\filepost.py", line 86, in encode_multipart_formdata
    body.write(data)
MemoryError

I have already searched this forum but could not find a suitable solution. Does anybody have an idea how to get this done? Could it be done by loading the file in chunks? If so, how do I upload the file in chunks so that the server does not "cancel" the operation?

Edit: following the answer from @AKX, I now use this code:

import requests
from requests_toolbelt.multipart import encoder

FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    session = requests.Session()
    with open(FilePath, 'rb') as f:
        form = encoder.MultipartEncoder({"documents": (FilePath, f, "application/octet-stream"),"composite": "NONE",})
        headers = {"Prefer": "respond-async", "Content-Type": form.content_type}
        resp = session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
    session.close()

Nevertheless, I get nearly the same error:

  File "<stdin>", line 1, in <module>
  File "C:\Users\AAA253\Desktop\DingDong.py", line 48, in <module>
    resp =  session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 578, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 317, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 505, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 159, in _encode_files
    fdata = fp.read()
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 314, in read
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 194, in _load
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 256, in _write
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 552, in append
MemoryError

Upvotes: 0

Views: 1230

Answers (1)

AKX

Reputation: 168873

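You will more likely than not need the streaming MultipartEncoder from requests-toolbelt. The encoder has to be passed as data=form, together with headers={"Content-Type": form.content_type}. If it is placed inside files= instead, requests calls read() on it and buffers the whole body in memory again, which is what the second traceback in the question shows (fdata = fp.read()).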

Even if your company restrictions forbid installing new packages, you can likely vendor in the parts of requests_toolbelt you need (maybe the whole package) into your project's directory.
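
If vendoring is not possible either, the same idea (never holding the whole body in memory) can be approximated with the standard library alone. The following is only a minimal sketch, not what requests-toolbelt does internally: the function name iter_multipart, the boundary handling and the 1 MB chunk size are my own assumptions, and it is written to avoid Python-3-only syntax but has not been tried on your 2.7 setup. It yields a multipart/form-data body piece by piece, reading the file one chunk at a time.

```python
import io
import os
import uuid


def iter_multipart(file_field, file_path, fields, boundary, chunk_size=1024 * 1024):
    """Yield a multipart/form-data body piece by piece.

    Hypothetical helper for illustration; it is not part of requests
    or requests-toolbelt.
    """
    # Plain text form fields first, e.g. {'file': 'Send file'} from the question.
    for name, value in fields.items():
        part = (
            "--{0}\r\n"
            'Content-Disposition: form-data; name="{1}"\r\n'
            "\r\n"
            "{2}\r\n"
        ).format(boundary, name, value)
        yield part.encode("utf-8")

    # Header of the file part (ASCII file names assumed).
    header = (
        "--{0}\r\n"
        'Content-Disposition: form-data; name="{1}"; filename="{2}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).format(boundary, file_field, os.path.basename(file_path))
    yield header.encode("utf-8")

    # File content, one chunk at a time, so memory use stays bounded.
    with io.open(file_path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk

    # Closing boundary that terminates the multipart body.
    yield "\r\n--{0}--\r\n".format(boundary).encode("utf-8")


def new_boundary():
    """Return a random boundary string that is unlikely to occur in the data."""
    return uuid.uuid4().hex
```

To use it, create a boundary with new_boundary() and pass the generator as data=iter_multipart("FileToBeUploaded", FilePath, {"file": "Send file"}, boundary) to session.post, along with headers={"Content-Type": "multipart/form-data; boundary=" + boundary}. When data is a generator, requests sends the body with chunked transfer encoding, so this only works if the server accepts chunked uploads. If it does not, the toolbelt encoder (which reports a total length) is the better route.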

Upvotes: 1
