user3524641

Reputation: 31

POST Binary (video) File Using Python Requests

I have a working bit of PHP code that uploads a binary to a remote server I don't have shell access to. The PHP code is:

function upload($uri, $filename) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $uri);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('file' => '@' . $filename));
curl_exec($ch);
curl_close($ch);
}

This results in a header like:

HTTP/1.1
Host: XXXXXXXXX
Accept: */*
Content-Length: 208045596
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------360aaccde050

I'm trying to port this over to Python using requests, and I cannot get the server to accept my POST. I have tried every variation of requests.post I can think of, but the header will not mimic the one above.

This successfully transfers the binary to the server (I can tell by watching Wireshark), but because the header is not what the server expects, the upload gets rejected. The response status code is 200, though.

files = {'bulk_test2.mov': ('bulk_test2.mov', open('bulk_test2.mov', 'rb'))}
response = requests.post(url, files=files)

The requests code results in a header of:

HTTP/1.1
Host: XXXX
Content-Length: 160
Content-Type: multipart/form-data; boundary=250852d250b24399977f365f35c4e060
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.2.1 CPython/2.7.5 Darwin/13.1.0

--250852d250b24399977f365f35c4e060
Content-Disposition: form-data; name="bulk_test2.mov"; filename="bulk_test2.mov"


--250852d250b24399977f365f35c4e060--

Any thoughts on how to make requests match the header that the PHP code generates?

Upvotes: 3

Views: 11276

Answers (1)

Martijn Pieters

Reputation: 1122142

There are two large differences:

  1. The PHP code posts a field named file, your Python code posts a field named bulk_test2.mov.

  2. Your Python code posts an empty file. The Content-Length header is 160 bytes, exactly the amount of space the multipart boundaries and the Content-Disposition part header take up. Either the bulk_test2.mov file is indeed empty, or you tried to post the file multiple times without rewinding or reopening the file object.
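
The 160-byte figure can be checked by hand. The sketch below rebuilds the body shown in the question (boundary, field name and filename copied from the captured request, with the CRLF line endings that multipart/form-data uses) around an empty payload and counts the bytes. It is a hand-rolled reconstruction for illustration only, not how requests assembles the body internally, and build_body is a made-up helper name.

```python
def build_body(boundary: bytes, field: bytes, filename: bytes, payload: bytes) -> bytes:
    """Assemble a single-part multipart/form-data body by hand (illustrative only)."""
    return b"".join([
        b"--" + boundary + b"\r\n",
        b'Content-Disposition: form-data; name="' + field
        + b'"; filename="' + filename + b'"\r\n',
        b"\r\n",                       # blank line ends the part headers
        payload + b"\r\n",             # file contents, empty in the question
        b"--" + boundary + b"--\r\n",  # closing delimiter
    ])


# Values copied from the request captured in the question.
boundary = b"250852d250b24399977f365f35c4e060"
empty_body = build_body(boundary, b"bulk_test2.mov", b"bulk_test2.mov", b"")
overhead = len(empty_body)
print(overhead)
```

With an empty payload the body comes to 160 bytes, matching the Content-Length in the question, so none of the file's contents were included in that request.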

To fix the first problem, use 'file' as the key in your files dictionary:

files = {'file': open('bulk_test2.mov', 'rb')}
response = requests.post(url, files=files)

I used just the open file object as the value; requests will get the filename directly from the file object in that case.

The second issue is something only you can fix. Make sure you don't reuse files when repeatedly posting. Reopen, or use files['file'].seek(0) to rewind the read position back to the start.
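
A minimal sketch of why a reused file object uploads nothing, using only the standard library. The temporary file and its contents are stand-ins invented for the demonstration; the real upload would use the .mov file.

```python
import os
import tempfile

# Write a small stand-in file; its name and contents are made up for this demo.
with tempfile.NamedTemporaryFile(suffix=".mov", delete=False) as tmp:
    tmp.write(b"fake video bytes")
    path = tmp.name

with open(path, "rb") as fh:
    first_read = fh.read()    # consumes the whole file, as one upload would
    second_read = fh.read()   # read position is at EOF, so nothing is left
    fh.seek(0)                # rewind to the start
    after_rewind = fh.read()  # the full contents are available again

os.remove(path)
print(len(first_read), len(second_read), len(after_rewind))
```

The second read returns an empty bytes object, which is what a second post of the same open file would send; after seek(0) the data is back.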

The Expect: 100-continue header is an optional client feature that asks the server to confirm that the body upload can go ahead. It is not a required header, and any failure to post your file is not going to be caused by whether or not requests uses this feature. If an HTTP server misbehaves when you don't use this feature, it is in violation of the HTTP RFCs and you'll have bigger problems on your hands. It certainly won't be something requests can fix for you.

If you do manage to post actual file data, any small variations in Content-Length are due to the (random) boundary being a different length between Python and PHP. This is normal, and not the cause of upload problems, unless your target server is extremely broken. Again, don't try to fix such brokenness with Python.
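
To see how the boundary alone shifts Content-Length, the sketch below counts the bytes contributed by the opening and closing delimiters for the two boundaries quoted in the question. multipart_overhead is a made-up helper, and the calculation assumes a body with a single part; part headers may differ between clients as well, which this does not model.

```python
def multipart_overhead(boundary: str) -> int:
    """Bytes the boundary delimiters add to a one-part body (hand-rolled sketch)."""
    opening = "--" + boundary + "\r\n"    # delimiter before the part
    closing = "--" + boundary + "--\r\n"  # closing delimiter after the part
    return len(opening.encode("ascii")) + len(closing.encode("ascii"))


# Boundaries copied from the two Content-Type headers in the question.
php_boundary = "----------------------------360aaccde050"
requests_boundary = "250852d250b24399977f365f35c4e060"

difference = multipart_overhead(php_boundary) - multipart_overhead(requests_boundary)
print(difference)
```

Because the boundary appears once in each delimiter, the difference is twice the difference in boundary length, a handful of bytes on an upload of roughly 200 MB.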

However, I'd assume you overlooked something much simpler. Perhaps the server blacklists certain User-Agent headers, for example. You could clear some of the default headers requests sets by using a Session object:

files = {'file': open('bulk_test2.mov', 'rb')}
session = requests.Session()
del session.headers['User-Agent']
del session.headers['Accept-Encoding']
response = session.post(url, files=files)

and see if that makes a difference.

If the server fails to handle your request because it fails to handle HTTP persistent connections, you could try to use the session as a context manager to ensure that all session connections are closed:

files = {'file': open('bulk_test2.mov', 'rb')}
with requests.Session() as session:
    response = session.post(url, files=files, stream=True)

and you could add:

response.raw.close()

for good measure.

Upvotes: 6
