Reputation: 5426
I am using the following to convert the response result to object:
response = requests.get(url=request_url)
myobjs = json.loads(response.text, object_hook=lambda d: Myobj(**d))
return myobjs
and
class Myobj(object):
    def __init__(self, id, display):
        self.id = str(id)
        self.name = str(display)
Sample JSON:
[
    {
        "id": "92cbb711-7e4d-417a-9530-f1850d9bc687",
        "display": "010lf.com"
    },
    {
        "id": "1060864a-a3a5-40c2-aa94-651fe2d10ae9",
        "display": "010lm.com"
    }
]
This worked well until, one day, the display field in the returned JSON contained a Unicode value, for example:
"display": "관악저널.kr"
It raises the following error:
File "mycode.py", line 5, in __init__
self.name = str(display)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
I would have thought the str() function would handle the Unicode string properly.
What am I missing?
I tried changing the line from
self.name = str(display)
to
self.name = display
That seems to do the trick, but I want to check whether I am doing it correctly and efficiently.
Upvotes: 0
Views: 125
Reputation: 177600
json returns strings as Unicode, so either store them as Unicode (the correct solution) or encode them as UTF-8. Note that in Python 2, str() converts a Unicode string to a byte string using the ascii codec, so it fails on any Unicode string that contains non-ASCII characters.
#!python2
#coding:utf8
import json
text = '''\
[
{
"id": "92cbb711-7e4d-417a-9530-f1850d9bc687",
"display": "관악저널.kr"
},
{
"id": "1060864a-a3a5-40c2-aa94-651fe2d10ae9",
"display": "010lm.com"
}
]'''
class Myobj(object):
    def __init__(self, id, display):
        self.id = id  # or id.encode('utf8')
        self.name = display  # or display.encode('utf8')
    def __repr__(self):
        return 'MyObj({self.id!r},{self.name!r})'.format(self=self)
myobjs = json.loads(text, object_hook=lambda d: Myobj(**d))
print(myobjs)
Output:
[MyObj(u'92cbb711-7e4d-417a-9530-f1850d9bc687',u'\uad00\uc545\uc800\ub110.kr'),
MyObj(u'1060864a-a3a5-40c2-aa94-651fe2d10ae9',u'010lm.com')]
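If byte strings are really needed, the "encode them as UTF-8" alternative can be sketched as below. This is only an illustration, not part of the original code: the variable names (display, ascii_ok, utf8_bytes, round_trip) are made up for the example, and the explicit encode("ascii") call stands in for the implicit conversion that Python 2's str() attempts. The snippet is written so that it should behave the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import json

# The sample value written with JSON \u escapes, so this file stays pure ASCII.
text = u'[{"id": "92cbb711-7e4d-417a-9530-f1850d9bc687", "display": "\\uad00\\uc545\\uc800\\ub110.kr"}]'

display = json.loads(text)[0]["display"]  # Unicode text, as returned by json

# Stand-in for what Python 2's str(display) tries implicitly: an ASCII encode.
try:
    display.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An explicit UTF-8 encode succeeds and decodes back to the same text.
utf8_bytes = display.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(ascii_ok)               # False
print(len(utf8_bytes))        # 15 (four 3-byte Hangul characters plus ".kr")
print(round_trip == display)  # True
```

Keeping the value as Unicode inside the program and encoding only at the boundary (writing to a file, socket, or terminal) is usually the simpler design, which is why storing display unchanged is the preferred fix.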
Upvotes: 1