User12345

Reputation: 5480

Better way to create plain Python lists from a JSON file

I have a JSON file that I am reading in Python. Its contents are below.

{
    "cities": [
        "NY",
        "SFO",
        "LA",
        "NJ"
    ],
    "companies": [
        "Apple",
        "Samsung",
        "Walmart"
    ],
    "devices": [
        "iphone",
        "ipad",
        "ipod",
        "watch"
    ]
}

I want to create Python lists from this JSON file. This is what I have done so far.

import json

# Open the JSON file and parse it
with open('test.json') as in_file:
    test_data = json.load(in_file)

# Query the output variable test_data 
test_data
{u'cities': [u'NY', u'SFO', u'LA', u'NJ'], u'companies': [u'Apple', u'Samsung', u'Walmart'], u'devices': [u'iphone', u'ipad', u'ipod', u'watch']}

# find type of test_data
type(test_data)
<type 'dict'>

# create list from test_data
device = test_data['devices']

# Check content of list created
device
[u'iphone', u'ipad', u'ipod', u'watch']

As you can see, the result is a list of unicode strings. I want a plain Python list of str objects.

I can do it like this:

device_list = [str(x) for x in device]
device_list
['iphone', 'ipad', 'ipod', 'watch']

Is there a better way to do this?

Upvotes: 0

Views: 84

Answers (3)

ak_slick

Reputation: 1016

Changing json.load to json.loads will not change the string type on its own, but I think round-tripping the parsed result through yaml will fix your issue and remove any need to map over each list.

Try this.

import json
import yaml


with open('temp.json', 'r') as f:
    json_str = f.read()

content = json.loads(json_str)

# Round-trip through YAML: on Python 2, PyYAML returns plain str for
# ASCII-only strings, so this removes the unicode objects from the dictionary
content = yaml.safe_load(json.dumps(content))

content
{'cities': ['NY', 'SFO', 'LA', 'NJ'], 'companies': ['Apple', 'Samsung', 'Walmart'], 'devices': ['iphone', 'ipad', 'ipod', 'watch']}

content['devices']
['iphone', 'ipad', 'ipod', 'watch']

Upvotes: 1

chepner

Reputation: 530960

The reason you get back a list of unicode objects is that JSON uses Unicode. For plain ASCII strings, it is sufficient to simply call str, but for "real" Unicode (anything outside ASCII), str fails and you need to encode the strings explicitly.

>>> [str(x) for x in json.loads(u'["foo"]')]
['foo']

>>> [str(x) for x in json.loads(u'["föö"]')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

>>> [x.encode('utf8') for x in json.loads(u'["föö"]')]
['f\xc3\xb6\xc3\xb6']
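
The same idea can be applied to the whole parsed structure rather than one list at a time. This is only a sketch: encode_strings and text_type are names I made up for illustration, and it assumes the data contains nothing but dicts, lists, strings, and JSON scalars. On Python 2 the encoded values are str; on Python 3 they are bytes.

```python
import json

# Hypothetical helper names, not part of the json module.
text_type = type(u"")  # unicode on Python 2, str on Python 3


def encode_strings(value, encoding="utf-8"):
    """Recursively encode every text string found in decoded JSON data."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, list):
        return [encode_strings(item, encoding) for item in value]
    if isinstance(value, dict):
        return {
            encode_strings(key, encoding): encode_strings(item, encoding)
            for key, item in value.items()
        }
    # Numbers, booleans and None pass through unchanged
    return value


# The \u00f6 escapes are decoded by json into the character "ö"
data = json.loads(u'{"devices": ["iphone", "f\\u00f6\\u00f6"]}')
encoded = encode_strings(data)
```

Note that the dictionary keys are encoded as well, so lookups afterwards must use the encoded form of the key.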

Upvotes: 1

Rakesh

Reputation: 82755

One approach is to use map.

Example (Python 2):

l = [u'iphone', u'ipad', u'ipod', u'watch']
print(map(str, l))

In Python 3, map returns an iterator, so wrap it in list:

print(list(map(str, l)))

Output:

['iphone', 'ipad', 'ipod', 'watch']

For ASCII-only values like these, it makes little practical difference whether the strings are unicode or regular str.
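
For what it's worth, on Python 3 there is only one text type, so json already hands back str and no conversion step is needed. A minimal sketch, with an inline JSON string (my own stand-in, not the asker's test.json) in place of the file:

```python
import json

# Hypothetical inline JSON standing in for the question's test.json
raw = '{"devices": ["iphone", "ipad", "ipod", "watch"]}'

devices = json.loads(raw)["devices"]

print(devices)
# On Python 3 every element is already a plain str
print(all(isinstance(d, str) for d in devices))
```

Run under Python 3, this prints the list without any u prefixes, followed by True.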

Upvotes: 1
