Reputation: 5480
I have a JSON file that I am reading in Python. The file contents are below.
{
    "cities": [
        "NY",
        "SFO",
        "LA",
        "NJ"
    ],
    "companies": [
        "Apple",
        "Samsung",
        "Walmart"
    ],
    "devices": [
        "iphone",
        "ipad",
        "ipod",
        "watch"
    ]
}
I want to create Python lists from this JSON file. This is what I have done so far.
# Open JSON file in Python
import json
with open('test.json') as out_file:
    test_data = json.load(out_file)
# Query the output variable test_data
test_data
{u'cities': [u'NY', u'SFO', u'LA', u'NJ'], u'companies': [u'Apple', u'Samsung', u'Walmart'], u'devices': [u'iphone', u'ipad', u'ipod', u'watch']}
# find type of test_data
type(test_data)
<type 'dict'>
# create list from test_data
device = test_data['devices']
# Check content of list created
device
[u'iphone', u'ipad', u'ipod', u'watch']
As you can see, the result is a list of unicode strings. I want it to be a list of plain Python str objects.
I can convert it like below:
device_list = [str(x) for x in device]
device_list
['iphone', 'ipad', 'ipod', 'watch']
Is there a better way to do this?
Upvotes: 0
Views: 84
Reputation: 1016
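Changing json.load to json.loads will not fix this by itself: both return unicode strings in Python 2. What removes the need to map over the list is round-tripping the parsed data through YAML, as below.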
Try this.
import json
import yaml
# Read and parse the JSON file; the file is closed when the block exits
with open('temp.json', 'r') as f:
    content = json.loads(f.read())
# Round-trip through YAML: under Python 2, PyYAML returns ASCII-only strings as plain str
content = yaml.safe_load(json.dumps(content))
content
{'cities': ['NY', 'SFO', 'LA', 'NJ'], 'companies': ['Apple', 'Samsung', 'Walmart'], 'devices': ['iphone', 'ipad', 'ipod', 'watch']}
content['devices']
['iphone', 'ipad', 'ipod', 'watch']
Upvotes: 1
Reputation: 530960
The reason you get back a list of unicode objects is that JSON uses Unicode. For plain ASCII strings, it is sufficient to simply call str, but for "real" Unicode, you need to encode them first.
>>> [str(x) for x in json.loads(u'["foo"]')]
['foo']
>>> [str(x) for x in json.loads(u'["föö"]')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)
>>> [x.encode('utf8') for x in json.loads(u'["föö"]')]
['f\xc3\xb6\xc3\xb6']
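The same encoding step can be applied to the whole parsed document rather than to one list. Here is a minimal sketch, assuming a hypothetical helper named encode_all (it is not part of the json module) that walks nested dicts and lists and encodes every text string as UTF-8. Under Python 2 the results are plain str objects; under Python 3 they are bytes, and the conversion is rarely needed there because str is already Unicode.

```python
import json

# Text type on either major version: unicode on Python 2, str on Python 3.
text_type = type(u'')


def encode_all(value, encoding='utf-8'):
    """Recursively encode every text string in a parsed JSON value.

    encode_all is a hypothetical helper for illustration, not a json API.
    """
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, list):
        return [encode_all(item, encoding) for item in value]
    if isinstance(value, dict):
        return {encode_all(k, encoding): encode_all(v, encoding)
                for k, v in value.items()}
    # Numbers, booleans and None pass through unchanged.
    return value


# The JSON text uses \u escapes so the source stays ASCII-only.
data = json.loads(u'{"devices": ["iphone", "f\\u00f6\\u00f6"]}')
encoded = encode_all(data)
```

Keys are encoded as well as values, so lookups on the result need encoded keys too.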
Upvotes: 1
Reputation: 82755
One approach is to use map.
Example (Python 2):
l = [u'iphone', u'ipad', u'ipod', u'watch']
print(map(str, l))
In Python 3, map returns an iterator, so wrap it in list:
print(list(map(str, l)))
Output:
['iphone', 'ipad', 'ipod', 'watch']
In most cases it does not make much difference whether the elements are unicode or regular strings.
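To build one list per key rather than converting a single list, the same map call can be applied to each value of the parsed dictionary. This is a minimal sketch, assuming the JSON is held in a string (the names raw and plain are only for illustration) and that every value is a list of ASCII strings:

```python
import json

# Stand-in for the contents of the question's JSON file (shortened).
raw = '{"cities": ["NY", "SFO"], "devices": ["iphone", "ipad"]}'
test_data = json.loads(raw)

# Convert every list in one pass; list() keeps this working on Python 3,
# where map returns an iterator instead of a list.
plain = {str(key): list(map(str, values)) for key, values in test_data.items()}

devices = plain['devices']
print(devices)
```

On Python 3 the strings returned by json are already str, so the conversion changes nothing there. It only matters on Python 2.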
Upvotes: 1