ThomasWest

Reputation: 505

Trim Double Quote from JSON value using Python

I'm writing Python code that grabs the value of Rank if Age is above 20. However, take a look at my JSON file:

[
    {
         "Age" : 22,
         "Rank": 100
    },
    {
         "Age" : 64,
         "Rank": "20"
    },
    {
         "Age" : 19,
         "Rank": 10
    },
    .
    .
    .
]

The actual database is much longer than this example; it reaches a hundred thousand records. I noticed that some of the data is messed up because some Rank values are written as strings.

What should I do to grab all the Rank values without a problem? Do I need to write another script that trims the quotation marks, if they exist, from the Rank values?

Edit: My Python code

# assume 'file' is a file that I passed in argument during code execution
thefile = open(file)
thedata = json.load(thefile, encoding="latin1")

myContainer = []

for person in thedata:
    if person["Age"] > 20:
        myContainer.append(person["Rank"])

# Here is the issue why I can't let Rank be String
print sum(myContainer)/ len(myContainer)

Update:

My expected output is [20, 10] instead of [u'20', 10].

Then, when I average it, it should print a number instead of raising an error.

Upvotes: 0

Views: 137

Answers (3)

You could do this in one line, but to keep it simple to understand, a plain loop works just fine.

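newList = []

# yourList is the list of dicts parsed from the JSON file
for item in yourList:
    # keys are case-sensitive: the data uses "Age" and "Rank"
    if int(item["Age"]) > 20:
        newList.append(int(item["Rank"]))

print(newList)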
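Since the question ultimately wants an average, here is a minimal sketch of that last step built on the same conversion. The helper name `average_rank`, its `min_age` parameter, and the sample list `people` are made up for illustration; the sample just mirrors the three records shown in the question.

```python
def average_rank(records, min_age=20):
    """Average the Rank of every record whose Age is above min_age.

    Hypothetical helper: int() accepts both 20 and "20", so mixed
    int/string ranks are normalised before summing.
    """
    ranks = [int(r["Rank"]) for r in records if int(r["Age"]) > min_age]
    if not ranks:
        return None  # nothing matched, so avoid dividing by zero
    # float() forces true division on Python 2 as well as Python 3
    return float(sum(ranks)) / len(ranks)


# Sample data mirroring the question (one Rank is a string)
people = [
    {"Age": 22, "Rank": 100},
    {"Age": 64, "Rank": "20"},
    {"Age": 19, "Rank": 10},
]

result = average_rank(people)
print(result)
```

Returning None for an empty selection is only one possible choice; raising an exception would be equally reasonable, depending on how the caller wants to treat "no matches".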

Upvotes: 2

Paul Rooney

Reputation: 21609

There are a couple of steps here; you could merge them if you wanted. It's not strictly required to remap the list to have int versions of Rank up front.

First step is to make all instances of Rank be an int value, by replacing the strings in those values with their int equivalent.

Then filter out any values with an Age that is not above 20.

Now you have a reduced list of the original data, so map that to contain just the Rank.

import json
from operator import itemgetter

data = '''[
    {
        "Age" : 22,
        "Rank": 100
    },
    {
        "Age" : 64,
        "Rank": "20"
    },
    {
        "Age" : 19,
        "Rank": 10
    }
]'''

def paramtoint(param):
    def fn(d):
        d[param] = int(d[param])
        return d
    return fn

fixed = map(paramtoint('Rank'), json.loads(data))

over20 = filter(lambda x: x['Age'] > 20, fixed)

# list() so the result also prints as a list on Python 3, where map() is lazy
print(list(map(itemgetter('Rank'), over20)))

Output

[100, 20]
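As a possible variation on the first step, the int conversion can be folded into parsing itself with the `object_hook` parameter of `json.loads`, which is called with every decoded JSON object. This is only a sketch: the hook name `rank_to_int` and the variable names are invented here, and it assumes every Rank value is something `int()` can convert.

```python
import json

raw = '''[
    {"Age": 22, "Rank": 100},
    {"Age": 64, "Rank": "20"},
    {"Age": 19, "Rank": 10}
]'''


def rank_to_int(record):
    # Hypothetical hook: receives each decoded JSON object as a dict
    # and returns the dict that should replace it in the result.
    if "Rank" in record:
        record["Rank"] = int(record["Rank"])
    return record


people = json.loads(raw, object_hook=rank_to_int)

# Rank is already an int everywhere, so only the filter remains
ranks_over_20 = [p["Rank"] for p in people if p["Age"] > 20]
print(ranks_over_20)
```

The same hook can be passed to `json.load` when reading from a file, so the rest of the script never sees a string Rank.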

Upvotes: 1

Tom Barron

Reputation: 1594

ranklist = [int(person["Rank"]) for person in array if person["Age"] > 20]

where array is a Python list of dicts parsed from the JSON file shown in the question.
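With a hundred thousand records, some Rank values might not be convertible at all. The question does not say this happens, so the following is a sketch under that assumption: the helper `to_int_or_none` and the "n/a" record are hypothetical, and bad records are skipped and counted rather than stopping the run.

```python
def to_int_or_none(value):
    """Return int(value), or None if the value cannot be converted."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


array = [
    {"Age": 22, "Rank": 100},
    {"Age": 64, "Rank": "20"},
    {"Age": 30, "Rank": "n/a"},  # hypothetical malformed record
    {"Age": 19, "Rank": 10},
]

ranklist = []
skipped = 0
for person in array:
    if person["Age"] > 20:
        rank = to_int_or_none(person["Rank"])
        if rank is None:
            skipped += 1  # keep a count so bad data does not go unnoticed
        else:
            ranklist.append(rank)

print(ranklist)
print(skipped)
```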

Upvotes: 0
