Reputation: 361
I have a simple Python 2 script that sends a PATCH to my BIG-IP server with a description containing a Chinese character.
The code sample is below:
# -*- coding: utf-8 -*-
import requests
import json
import base64
url = "https://10.13.7.17/mgmt/tm/ltm/virtual/~Project_6fd06a50b7824ae48386565786e94b38~test"
# the unicode is u'bu\u9000'
test2 = u"bu退"
# encode the unicode character to 'utf-8'
# 'utf-8' code is {'description': 'bu\xe9\x80\x80'}
data = {"description": test2.encode('utf-8')}
payload = data
headers = {
'Authorization': 'Basic YWRamdflk6YWRtaW5Aasdlfjkla',
}
# Use the json parameter rather than data.
response = requests.request("PATCH", url, headers=headers, json=payload, verify=False)
# If the line above is changed to:
# response = requests.request("PATCH", url, headers=headers, data=test2.encode('utf-8'), verify=False)
# the HTTP PATCH succeeds, but the underlying code passes `json` as the parameter, which I cannot simply modify.
print(response.text)
The server returns an error like the following:
{"code":400,"message":"double quotes are not balanced","errorStack":[],"apiError":26214401}
I dug into the requests model and found that it turns the UTF-8 bytes back into an escaped Unicode code point.
When the request prepares its body, the json parameter is initially {'description': 'bu\xe9\x80\x80'}. After complexjson.dumps(json), the body contains the character in escaped form: '{"description": "bu\\u9000"}'.
The code then does not apply the UTF-8 encoding step again, because `not isinstance(body, bytes)` is False (on Python 2 the str returned by dumps is already bytes).
> /usr/lib/python2.7/site-packages/requests/models.py(459)prepare_body()
-> content_type = 'application/json'
(Pdb) l
454
455 if not data and json is not None:
456 # urllib3 requires a bytes-like body. Python 2's json.dumps
457 # provides this natively, but Python 3 gives a Unicode string.
458
459 -> content_type = 'application/json'
460 body = complexjson.dumps(json)
461 if not isinstance(body, bytes):
462 body = body.encode('utf-8')
463
464 is_stream = all([
(Pdb) p json
{'description': 'bu\xe9\x80\x80'}
(Pdb) n
> /usr/lib/python2.7/site-packages/requests/models.py(460)prepare_body()
-> body = complexjson.dumps(json)
(Pdb) n
> /usr/lib/python2.7/site-packages/requests/models.py(461)prepare_body()
-> if not isinstance(body, bytes):
(Pdb) p body
'{"description": "bu\\u9000"}'
(Pdb) n
> /usr/lib/python2.7/site-packages/requests/models.py(464)prepare_body()
-> is_stream = all([
According to this GitHub comment about dumps(json):
"By default, the Python JSON library will automatically escape any non-ASCII unicode code points".
On Python 2:
>>> json.dumps(data)
'{"description": "bu\\u9000"}'
>>> json.dumps(data, ensure_ascii=False)
u'{"description": "bu\u9000"}'
>>> json.dumps(data).encode('utf-8')
'{"description": "bu\\u9000"}'
>>> json.dumps(data, ensure_ascii=False).encode('utf-8')
'{"description": "bu\xe9\x80\x80"}'
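For comparison, here is a minimal sketch of the two serializations side by side, written to run on both Python 2.7 and Python 3. The variable names are illustrative only, and the payload holds a text value rather than pre-encoded bytes. It suggests that the escaped body and the raw UTF-8 body are the same document as far as Python's own JSON parser is concerned; it says nothing about how the BIG-IP parser behaves.

```python
import json

# Text value (u"bu退"), not pre-encoded UTF-8 bytes.
payload = {"description": u"bu\u9000"}

# Default behaviour: non-ASCII code points are written as \uXXXX escapes,
# so the serialized body is pure ASCII.
escaped_body = json.dumps(payload)

# ensure_ascii=False keeps the character itself; the text is then encoded
# to UTF-8 bytes for the wire.
raw_body = json.dumps(payload, ensure_ascii=False).encode("utf-8")

# Python's JSON parser decodes both bodies to the same value.
same = json.loads(escaped_body) == json.loads(raw_body.decode("utf-8"))
print(same)
```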
Is there any way to use the json parameter of the requests.request method to POST/PATCH non-ASCII characters? Or is this a server-side problem, where the server cannot decode the escaped Unicode as UTF-8?
Can anyone help? Thanks.
Upvotes: 0
Views: 233
Reputation: 308
The value in your data is a byte string. Try test2.encode('utf-8').decode('utf-8') to convert it back to a text string (on Python 2, a bare .decode() falls back to the ASCII codec and fails on these bytes).
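If the call site can be changed to pass `data=` instead of `json=`, one possible workaround is to serialize the payload yourself with ensure_ascii=False. The following is only a sketch: `build_utf8_json_body` is a hypothetical helper name, not part of requests, and it has not been tested against a BIG-IP server.

```python
import json


def build_utf8_json_body(payload):
    """Serialize payload to JSON as UTF-8 bytes without \\uXXXX escapes.

    Hypothetical helper for illustration; expects text (unicode) values.
    """
    body = json.dumps(payload, ensure_ascii=False)
    # Same check that requests applies: encode only if the result is text.
    if not isinstance(body, bytes):
        body = body.encode("utf-8")
    return body


body = build_utf8_json_body({"description": u"bu\u9000"})
print(repr(body))
```

The resulting bytes could then be sent with `requests.request("PATCH", url, headers=headers, data=body, verify=False)`, with `'Content-Type': 'application/json'` added to `headers`, since requests sets that header automatically only when the `json` parameter is used.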
Upvotes: 0