Reputation: 505
I'm writing Python code that grabs the value of Rank whenever Age is above 20. However, take a look at my JSON file:
[
{
"Age" : 22,
"Rank": 100
},
{
"Age" : 64,
"Rank": "20"
},
{
"Age" : 19,
"Rank": 10
}
.
.
.
]
The actual database is much longer than this example; it reaches a hundred thousand entries. I noticed that some of the data is messed up because some Rank values are written as strings.
What should I do to grab all the Rank values without a problem? Do I need to write another script that strips the quotation marks, if they exist, from the Rank values?
Edit: My Python code
# assume 'file' is a file that I passed in argument during code execution
thefile = open(file)
thedata = json.load(thefile, encoding="latin1")
myContainer = []
for person in thedata:
    if person["Age"] > 20:
        myContainer.append(person["Rank"])
# Here is the issue why I can't let Rank be String
print sum(myContainer)/ len(myContainer)
UPDATE:
My expected output is [20, 10] instead of [u'20', 10].
Then, when I average it, it should print a number instead of raising an error.
Upvotes: 0
Views: 137
Reputation: 74
You could do this in one line, but a plain loop is easier to follow. Passing each value through int() handles the ranks stored as numbers and the ranks stored as strings alike.
# yourList is the list of dicts parsed from the JSON file
newList = []
for item in yourList:
    if int(item["Age"]) > 20:
        newList.append(int(item["Rank"]))
print(newList)
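To get the average the question asks for, the converted list can be summed and divided by its length. This is a minimal sketch, assuming the data looks like the sample in the question; the inline `raw` string and the names `people`, `ranks` and `average` are illustrative and not taken from the original code.

```python
import json

# Sample data in the shape shown in the question (assumed, not the real file)
raw = '[{"Age": 22, "Rank": 100}, {"Age": 64, "Rank": "20"}, {"Age": 19, "Rank": 10}]'
people = json.loads(raw)

ranks = []
for person in people:
    if int(person["Age"]) > 20:
        # int() accepts both 100 and "20"
        ranks.append(int(person["Rank"]))

# float() keeps the division from truncating on Python 2
average = sum(ranks) / float(len(ranks))
print(average)  # 60.0
```

Dividing by `float(len(ranks))` is only needed on Python 2; on Python 3 the `/` operator already returns a float.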
Upvotes: 2
Reputation: 21609
There are a couple of steps here; you could merge them if you wanted. It's not strictly required to remap the list to have int versions of Rank up front.
The first step is to make every Rank an int, by replacing the string values with their int equivalents.
Then filter out any entries with an Age that is not above 20.
Now you have a reduced list of the original data, so map that to contain just the Rank.
import json
from operator import itemgetter
data = '''[
{
"Age" : 22,
"Rank": 100
},
{
"Age" : 64,
"Rank": "20"
},
{
"Age" : 19,
"Rank": 10
}
]'''
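def paramtoint(param):
    # Build a function that converts d[param] to int in place and returns d
    def fn(d):
        d[param] = int(d[param])
        return d
    return fn

fixed = map(paramtoint('Rank'), json.loads(data))
over20 = filter(lambda x: x['Age'] > 20, fixed)
# list() makes this print a list on Python 3 too, where map is lazy
print(list(map(itemgetter('Rank'), over20)))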
Output
[100, 20]
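If the intermediate lists are not needed, the three steps can be merged into a single pass over the records. One possible sketch follows; `ranks_over` is a hypothetical helper name, and `data` mirrors the sample used above.

```python
import json

data = '''[
  {"Age": 22, "Rank": 100},
  {"Age": 64, "Rank": "20"},
  {"Age": 19, "Rank": 10}
]'''

def ranks_over(records, min_age):
    # Hypothetical helper: yield Rank as an int for records with Age above min_age
    for record in records:
        if record["Age"] > min_age:
            yield int(record["Rank"])

merged = list(ranks_over(json.loads(data), 20))
print(merged)  # [100, 20]
```

Because the generator converts only the records that pass the age check, it also leaves the parsed dicts unmodified, unlike the in-place remapping.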
Upvotes: 1
Reputation: 1594
ranklist = [int(person["Rank"]) for person in array if person["Age"] > 20]
where array is the Python list of dicts parsed from the JSON file shown in the question.
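If some Rank values in the full file cannot be converted at all, int() raises ValueError and the comprehension stops at that record. That is an assumption: the question only mentions numbers stored as strings. Under that assumption, a rough sketch that skips such records could look like this; `to_int_or_none` is a hypothetical helper, and the "n/a" record is invented sample data.

```python
def to_int_or_none(value):
    # Hypothetical helper: return value as an int, or None if it cannot be converted
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

# Invented sample; the "n/a" entry stands in for an unconvertible Rank
array = [
    {"Age": 22, "Rank": 100},
    {"Age": 64, "Rank": "20"},
    {"Age": 30, "Rank": "n/a"},
    {"Age": 19, "Rank": 10},
]

converted = (to_int_or_none(person["Rank"]) for person in array if person["Age"] > 20)
ranklist = [rank for rank in converted if rank is not None]
print(ranklist)  # [100, 20]
```

Whether to skip, log, or fail on such records depends on how the average is going to be used.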
Upvotes: 0