Reputation: 91
I have a JSON file containing an array of dictionaries; each dictionary is a buyer's opinion of a specific garage. I want to find out how many occurrences of each car type I have in each garage. The data looks like this:
[
{"garage": "mike_gar", "reliability": 6, "car_type": "ford", "time": "16:10:36"},
{"garage": "bill_gar", "reliability": 5,"car_type": "kia", "time": "4:37:22"},
{"garage": "alison_gar", "reliability": 1, "car_type": "kia", "time": "11:25:40"},
{"garage": "alison_gar", "reliability": 10, "car_type": "mazda", "time": "2:18:42"},
{"garage": "mike_gar", "reliability": 3, "car_type": "mazda", "time": "12:14:20"},
{"garage": "mike_gar", "reliability": 2, "car_type": "ford", "time": "2:08:27"}
]
Assume we have already read the JSON file into a variable g_arr. I've tried to use reduce() to count the occurrences, but didn't succeed.
Output example: {"garage" : "mike_gar", "types":{"ford" : 2, "mazda": 1}}
Upvotes: 1
Views: 102
Reputation: 17794
The pandas package is great for working with data like this. You can easily convert your list into a pandas DataFrame.
import pandas as pd
df = pd.DataFrame(g_arr)
print(df)
Prints:
  car_type      garage  reliability      time
0     ford    mike_gar            6  16:10:36
1      kia    bill_gar            5   4:37:22
2      kia  alison_gar            1  11:25:40
3    mazda  alison_gar           10   2:18:42
4    mazda    mike_gar            3  12:14:20
5     ford    mike_gar            2   2:08:27
Then you can use the .groupby() method to group your data and the .size() method to get row counts per group.
print(df.groupby(['garage', 'car_type']).size())
Prints:
garage      car_type
alison_gar  kia         1
            mazda       1
bill_gar    kia         1
mike_gar    ford        2
            mazda       1
dtype: int64
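The grouped counts above are a pandas Series, not the {"garage": ..., "types": {...}} shape the question asks for. One way to get there, as a minimal sketch assuming pandas is installed and g_arr holds the parsed list from the question, is to group by garage and call value_counts() on the car_type column of each group (the name result is only illustrative):

```python
import pandas as pd

# Sample data copied from the question; in practice g_arr comes from the JSON file.
g_arr = [
    {"garage": "mike_gar", "reliability": 6, "car_type": "ford", "time": "16:10:36"},
    {"garage": "bill_gar", "reliability": 5, "car_type": "kia", "time": "4:37:22"},
    {"garage": "alison_gar", "reliability": 1, "car_type": "kia", "time": "11:25:40"},
    {"garage": "alison_gar", "reliability": 10, "car_type": "mazda", "time": "2:18:42"},
    {"garage": "mike_gar", "reliability": 3, "car_type": "mazda", "time": "12:14:20"},
    {"garage": "mike_gar", "reliability": 2, "car_type": "ford", "time": "2:08:27"},
]

df = pd.DataFrame(g_arr)

# One dictionary per garage; value_counts() tallies the car types in that group.
result = [
    {"garage": garage, "types": group["car_type"].value_counts().to_dict()}
    for garage, group in df.groupby("garage")
]

print(result)
```

Because groupby() sorts its keys by default, the garages come out in alphabetical order rather than in order of first appearance.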
Upvotes: 1
Reputation: 56865
Here's a solution based on reduction, written as a plain loop over an accumulator dictionary. First, I test whether the garage exists in the accumulator, and if not, create its entry. Then, I check whether the car type exists in that garage's types dictionary, and if not, I create it with a count of zero. Finally, I increment the count for the car type.
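res = {}
for d in garages:  # garages is the parsed list (g_arr in the question)
    # Create the garage entry the first time this garage is seen.
    if d["garage"] not in res:
        res[d["garage"]] = {"garage": d["garage"], "types": {}}
    # Start the count at zero the first time this car type is seen.
    if d["car_type"] not in res[d["garage"]]["types"]:
        res[d["garage"]]["types"][d["car_type"]] = 0
    res[d["garage"]]["types"][d["car_type"]] += 1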
Output:
{
'mike_gar': {'garage': 'mike_gar', 'types': {'ford': 2, 'mazda': 1}},
'bill_gar': {'garage': 'bill_gar', 'types': {'kia': 1}},
'alison_gar': {'garage': 'alison_gar', 'types': {'kia': 1, 'mazda': 1}}
}
If you'd like your result in a list, use list(res.values()).
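Since the question mentions reduce() specifically, the same accumulation can also be sketched with functools.reduce. This is only one possible way to write it: the helper name add_opinion is made up for illustration, and g_arr is assumed to hold the parsed list from the question.

```python
from functools import reduce

# Sample data copied from the question; in practice g_arr comes from the JSON file.
g_arr = [
    {"garage": "mike_gar", "reliability": 6, "car_type": "ford", "time": "16:10:36"},
    {"garage": "bill_gar", "reliability": 5, "car_type": "kia", "time": "4:37:22"},
    {"garage": "alison_gar", "reliability": 1, "car_type": "kia", "time": "11:25:40"},
    {"garage": "alison_gar", "reliability": 10, "car_type": "mazda", "time": "2:18:42"},
    {"garage": "mike_gar", "reliability": 3, "car_type": "mazda", "time": "12:14:20"},
    {"garage": "mike_gar", "reliability": 2, "car_type": "ford", "time": "2:08:27"},
]


def add_opinion(acc, opinion):
    """Fold one opinion into the accumulator dictionary and return it."""
    # setdefault creates the garage entry only if it is missing.
    entry = acc.setdefault(
        opinion["garage"], {"garage": opinion["garage"], "types": {}}
    )
    types = entry["types"]
    # get(..., 0) treats a car type that has not been seen yet as a count of zero.
    types[opinion["car_type"]] = types.get(opinion["car_type"], 0) + 1
    return acc


# The third argument is the initial (empty) accumulator.
res = reduce(add_opinion, g_arr, {})

print(list(res.values()))
```

The reducer mutates and returns the same dictionary on every step, so this behaves like the loop above; the explicit loop is arguably easier to read.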
Upvotes: 1
Reputation: 948
You can simply loop over your data and do the count as follows:
garages = []
cars = []
output = []
# Collect the distinct garages and car types; data is the parsed list.
for element in data:
    if element['garage'] not in garages:
        garages.append(element['garage'])
    if element['car_type'] not in cars:
        cars.append(element['car_type'])
for garage in garages:
    current = {}
    current['types'] = {}
    current['garage'] = garage
    for element in data:
        # Make sure every car type appears, with a count of 0 by default.
        if element['car_type'] not in current['types']:
            current['types'][element['car_type']] = 0
        # Count only the opinions that belong to the current garage.
        if current['garage'] == element['garage']:
            for car_type in cars:
                if element['car_type'] == car_type:
                    current['types'][element['car_type']] += 1
    output.append(current)
print(output)
The output of executing the above is (key order may differ between Python versions):
[{'garage': 'mike_gar', 'types': {'mazda': 1, 'kia': 0, 'ford': 2}}, {'garage': 'bill_gar', 'types': {'mazda': 0, 'kia': 1, 'ford': 0}}, {'garage': 'alison_gar', 'types': {'mazda': 1, 'kia': 1, 'ford': 0}}]
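The same zero-filled result can probably be built in a single pass over the data with collections.Counter, which returns 0 for keys it has not seen. The following is a sketch under the assumption that data holds the parsed list from the question; the names counts and all_types are illustrative only.

```python
from collections import Counter, defaultdict

# Sample data copied from the question; in practice data comes from the JSON file.
data = [
    {"garage": "mike_gar", "reliability": 6, "car_type": "ford", "time": "16:10:36"},
    {"garage": "bill_gar", "reliability": 5, "car_type": "kia", "time": "4:37:22"},
    {"garage": "alison_gar", "reliability": 1, "car_type": "kia", "time": "11:25:40"},
    {"garage": "alison_gar", "reliability": 10, "car_type": "mazda", "time": "2:18:42"},
    {"garage": "mike_gar", "reliability": 3, "car_type": "mazda", "time": "12:14:20"},
    {"garage": "mike_gar", "reliability": 2, "car_type": "ford", "time": "2:08:27"},
]

# One Counter per garage, created on first access.
counts = defaultdict(Counter)
for element in data:
    counts[element["garage"]][element["car_type"]] += 1

# Every car type that occurs anywhere, so each garage can list all of them.
all_types = sorted({element["car_type"] for element in data})

output = [
    # Looking up a missing key in a Counter yields 0, which gives the zero-filling.
    {"garage": garage, "types": {t: counter[t] for t in all_types}}
    for garage, counter in counts.items()
]

print(output)
```

Dropping all_types and using dict(counter) instead would list only the car types each garage actually has.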
Upvotes: 1