Rob Watts

Reputation: 7146

Python requests not able to stream utf-8 encoded file

I'm trying to use HTTP streaming to upload a file that contains Unicode characters, but I'm getting a UnicodeEncodeError:

>>> requests.put(my_url, headers=my_headers, data=open('test.csv', 'r', encoding='utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../python3.5/site-packages/requests/api.py", line 126, in put
    return request('put', url, data=data, **kwargs)
  File ".../python3.5/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File ".../python3.5/site-packages/requests/sessions.py", line 518, in request
    resp = self.send(prep, **send_kwargs)
  File ".../python3.5/site-packages/requests/sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File ".../python3.5/site-packages/requests/adapters.py", line 438, in send
    timeout=timeout
  File ".../python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File ".../python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File ".../python3.5/http/client.py", line 1107, in request
    self._send_request(method, url, body, headers)
  File ".../python3.5/http/client.py", line 1152, in _send_request
    self.endheaders(body)
  File ".../python3.5/http/client.py", line 1103, in endheaders
    self._send_output(message_body)
  File ".../python3.5/http/client.py", line 936, in _send_output
    self.send(message_body)
  File ".../python3.5/http/client.py", line 904, in send
    datablock = datablock.encode("iso-8859-1")
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2122' in position 6375: ordinal not in range(256)

I get the error whether or not I include encoding='utf-8'. How can I send this file without loading the entire file into memory, while still avoiding the Unicode encoding error?

Upvotes: 2

Views: 2063

Answers (2)

Rob Watts

Reputation: 7146

At least in my case, all I needed to do was open the file in binary mode:

>>> requests.put(my_url, headers=my_headers, data=open('test.csv', 'rb'))

With the file opened in binary mode, Python did not try to encode the contents and just sent the raw bytes on to the URL.
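For anyone who wants the file closed reliably after the upload, here is a minimal sketch of the same binary-mode idea wrapped in a `with` block. The helper name `put_file_streaming` and the choice to pass in a session object are my own assumptions for illustration, not part of requests; the only requests call it relies on is `Session.put` with a file object as `data`.

```python
def put_file_streaming(session, url, path, headers=None):
    """PUT the file at `path` as the request body, streamed from disk.

    `session` is expected to be a requests.Session (or anything with a
    compatible .put method); this helper itself is hypothetical.
    """
    # Binary mode hands over bytes, so http.client never has to encode
    # a str body as latin-1, which is where the original error came from.
    with open(path, "rb") as body:
        return session.put(url, headers=headers, data=body)
```

Usage would look something like `put_file_streaming(requests.Session(), my_url, 'test.csv', headers=my_headers)`, with `my_url` and `my_headers` as in the question.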

Upvotes: 2

zwer

Reputation: 25809

open(..., encoding="utf-8") doesn't encode the file contents. Quite the opposite: it tells open() to decode the file contents into a regular Unicode string (str). When the request body is a str, http.client encodes it as latin-1 (yeah, HTTP be ancient like that), and that fails for any character outside latin-1, such as '\u2122'. You need to encode the contents to bytes before sending them. Try with:

requests.put(my_url, headers=my_headers, data=open("test.csv", "r", encoding="utf-8").read().encode("utf-8"))

Though that's very bad form for dealing with file contents: it reads the whole file into memory and never explicitly closes the file.
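If the decode-then-encode step is really needed (say, the file on disk is not UTF-8 but the server expects UTF-8), a rough sketch that avoids reading everything at once is to yield encoded chunks from a generator. The function name `iter_utf8_chunks` and its `source_encoding` and `chunk_chars` parameters are made up for this example. requests accepts a generator as `data` and sends it with chunked transfer encoding, so this only works if the server accepts chunked uploads.

```python
def iter_utf8_chunks(path, source_encoding="utf-8", chunk_chars=64 * 1024):
    """Yield the text file at `path` as UTF-8 encoded byte chunks.

    Hypothetical helper: reads at most `chunk_chars` characters at a time,
    so only one chunk is held in memory.
    """
    # newline="" turns off newline translation so line endings are sent as-is.
    with open(path, "r", encoding=source_encoding, newline="") as fh:
        while True:
            text = fh.read(chunk_chars)
            if not text:
                break
            # Encode explicitly so the HTTP layer only ever sees bytes.
            yield text.encode("utf-8")

# Usage, with my_url and my_headers as in the question:
# requests.put(my_url, headers=my_headers, data=iter_utf8_chunks("test.csv"))
```

If the file is already UTF-8 on disk, opening it in binary mode is simpler and does the same job.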

Upvotes: 0
