Reputation: 125
Given a list of dictionaries:
players= [
{ "name": 'matt', 'school': 'WSU', 'homestate': 'CT', 'position': 'RB' },
{ "name": 'jack', 'school': 'ASU', 'homestate': 'AL', 'position': 'QB' },
{ "name": 'john', 'school': 'WSU', 'homestate': 'MD', 'position': 'LB' },
{ "name": 'kevin', 'school': 'ALU', 'homestate': 'PA', 'position': 'LB' },
{ "name": 'brady', 'school': 'UM', 'homestate': 'CA', 'position': 'QB' },
]
How do I group them by their matching dictionary values, so that the output looks like this:
Matching Value 1:
name: [matt, john, kevin],
school: [WSU, WSU, ALU],
homestate: [CT, MD, PA]
position: [RB, LB, LB]
Matching Value 2:
name: [jack, brady],
school: [ASU, UM],
homestate: [AL, CA]
position: [QB, QB]
Notice that the matching value is arbitrary; that is, it can be found under any key. Maybe it's in school or in position, or maybe in both.
I tried grouping them by doing:
from collections import defaultdict
result_dictionary = {}
for i in players:
    for key, value in i.items():
        result_dictionary.setdefault(key, []).append(value)
Which gives out:
{'name': ['matt', 'jack', 'john', 'kevin', 'brady'],
'school': ['WSU', 'ASU', 'WSU', 'ALU', 'UM'],
'homestate': ['CT', 'AL', 'MD', 'PA', 'CA'],
'position': ['RB', 'QB', 'LB', 'LB', 'QB']}
But I'm stuck on how to further manipulate this output to get the required output stated above, and I'm sure there is a better, simpler approach.
Upvotes: 4
Views: 112
Reputation: 71451
You can find the most commonly occurring value under each header and use that value as the focal point for further grouping:
import itertools
players= [
{ "name": 'matt', 'school': 'WSU', 'homestate': 'CT', 'position': 'RB' },
{ "name": 'jack', 'school': 'ASU', 'homestate': 'AL', 'position': 'QB' },
{ "name": 'john', 'school': 'WSU', 'homestate': 'MD', 'position': 'LB' },
{ "name": 'kevin', 'school': 'ALU', 'homestate': 'PA', 'position': 'S' },
{ "name": 'brady', 'school': 'UM', 'homestate': 'CA', 'position': 'QB' },
]
headers = ['name', 'school', 'homestate', 'position']
final_header = [[a, max(b, key=lambda x:b.count(x))] for a, b in zip(headers, zip(*[[i[b] for b in headers] for i in players])) if len(set(b)) < len(b)]
d = [[list(b) for _, b in itertools.groupby(filter(lambda x:x[i] == c, players), key=lambda x:x[i])][0] for i, c in final_header]
last_results = {'pattern {}'.format(i):{d[0][0]:[j[-1] for j in d] for c, d in zip(headers, zip(*map(dict.items, h)))} for i, h in enumerate(d, start=1)}
Output:
{'pattern 2':
{'homestate': ['AL', 'CA'],
'school': ['ASU', 'UM'],
'name': ['jack', 'brady'],
'position': ['QB', 'QB']},
'pattern 1':
{'homestate': ['CT', 'MD'],
'school': ['WSU', 'WSU'],
'name': ['matt', 'john'],
'position': ['RB', 'LB']}
}
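The same idea can be spelled out step by step with collections.Counter, which may be easier to follow than the one-liners above. This is only a sketch under the same assumptions: the helper name group_by_most_common is made up for illustration, every dictionary is assumed to have the same keys, and a key only yields a pattern when its most common value occurs at least twice.

```python
from collections import Counter, defaultdict

players = [
    {"name": "matt", "school": "WSU", "homestate": "CT", "position": "RB"},
    {"name": "jack", "school": "ASU", "homestate": "AL", "position": "QB"},
    {"name": "john", "school": "WSU", "homestate": "MD", "position": "LB"},
    {"name": "kevin", "school": "ALU", "homestate": "PA", "position": "S"},
    {"name": "brady", "school": "UM", "homestate": "CA", "position": "QB"},
]


def group_by_most_common(rows):
    """Hypothetical helper: one pattern per key whose most common value repeats."""
    patterns = {}
    for key in list(rows[0]):  # assumes all rows share the same keys
        value, count = Counter(row[key] for row in rows).most_common(1)[0]
        if count < 2:
            continue  # no repeated value under this key, so nothing to group
        columns = defaultdict(list)
        for row in rows:
            if row[key] == value:
                # Collect every field of the matching row, column by column.
                for k, v in row.items():
                    columns[k].append(v)
        patterns["pattern {}".format(len(patterns) + 1)] = dict(columns)
    return patterns


result = group_by_most_common(players)
```

With the sample data above, this should yield one pattern for the shared school and one for the shared position, like the output shown.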
Upvotes: 1
Reputation: 95948
Just use collections.defaultdict, which you already imported:
In [21]: from collections import defaultdict
    ...: result = defaultdict(lambda: defaultdict(list))
    ...: for d in players:
    ...:     for k, v in d.items():
    ...:         result[d['school']][k].append(v)
    ...:
In [22]: result
Out[22]:
defaultdict(<function __main__.<lambda>>,
{'ASU': defaultdict(list,
{'homestate': ['AL'],
'name': ['jack'],
'position': ['QB'],
'school': ['ASU']}),
'WSU': defaultdict(list,
{'homestate': ['CT', 'MD'],
'name': ['matt', 'john'],
'position': ['RB', 'LB'],
'school': ['WSU', 'WSU']})})
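This groups on school only. If the match may sit under any key, as the question describes, one possible approach is to treat two players as linked whenever they share a value under the same key, and then collect the linked players into groups. The following is a rough sketch using a small union-find; the helper name group_by_shared_values is hypothetical, and it assumes links are transitive (matt and kevin end up together because both are linked to john).

```python
from collections import defaultdict

players = [
    {"name": "matt", "school": "WSU", "homestate": "CT", "position": "RB"},
    {"name": "jack", "school": "ASU", "homestate": "AL", "position": "QB"},
    {"name": "john", "school": "WSU", "homestate": "MD", "position": "LB"},
    {"name": "kevin", "school": "ALU", "homestate": "PA", "position": "LB"},
    {"name": "brady", "school": "UM", "homestate": "CA", "position": "QB"},
]


def group_by_shared_values(rows):
    """Hypothetical helper: merge rows that share any (key, value) pair."""
    parent = list(range(len(rows)))  # union-find forest over row indices

    def find(i):
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (key, value) -> index of the first row containing it
    for idx, row in enumerate(rows):
        for key, value in row.items():
            if (key, value) in first_seen:
                a, b = find(first_seen[(key, value)]), find(idx)
                parent[max(a, b)] = min(a, b)  # keep the earliest row as root
            else:
                first_seen[(key, value)] = idx

    # Collect each group's fields column by column, in input order.
    groups = defaultdict(lambda: defaultdict(list))
    for idx, row in enumerate(rows):
        root = find(idx)
        for key, value in row.items():
            groups[root][key].append(value)
    return [dict(columns) for columns in groups.values()]


groups = group_by_shared_values(players)
```

With the question's data this should produce two groups, one for matt, john and kevin and one for jack and brady. Values are only compared within the same key, so a 'CA' under homestate would not match a 'CA' under school.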
Upvotes: 3