Reputation: 539
Here's a simplified example of some data I have:
{"id": "1234565", "fields": {"name": "john", "email":"[email protected]", "country": "uk"}}
The full data set is a larger list of such nested dictionaries holding address data. The goal is to create pairs of people from the list with randomized partners, where partners from the same country should be preferred. So my first real issue is to find a good way to group them by that country value.
I'm sure there's a smarter way to do this than iterating through the dict and writing all records out to some new list/dict?
Upvotes: 0
Views: 1500
Reputation: 1002
Here is another one that uses defaultdict:
import collections
def make_groups(nested_dicts, nested_key):
    default = collections.defaultdict(list)
    for nested_dict in nested_dicts:
        for value in nested_dict.values():
            try:
                default[value[nested_key]].append(nested_dict)
            except TypeError:
                # value is not a nested dict (e.g. the 'id' entry); skip it
                pass
    return default
To test the results:
import random
# A list rather than a set: random.choice needs a sequence, and
# random.sample no longer accepts sets as of Python 3.11.
COUNTRY = ['af', 'br', 'fr', 'mx', 'uk']
people = [{'id': i, 'fields': {
    'name': 'name'+str(i),
    'email': str(i)+'@email',
    'country': random.choice(COUNTRY)}}
    for i in range(10)]
country_groups = make_groups(people, 'country')
for country, persons in country_groups.items():
    print(country, persons)
Random output:
fr [{'id': 0, 'fields': {'name': 'name0', 'email': '0@email', 'country': 'fr'}}, {'id': 1, 'fields': {'name': 'name1', 'email': '1@email', 'country': 'fr'}}, {'id': 4, 'fields': {'name': 'name4', 'email': '4@email', 'country': 'fr'}}]
br [{'id': 2, 'fields': {'name': 'name2', 'email': '2@email', 'country': 'br'}}, {'id': 8, 'fields': {'name': 'name8', 'email': '8@email', 'country': 'br'}}]
uk [{'id': 3, 'fields': {'name': 'name3', 'email': '3@email', 'country': 'uk'}}, {'id': 7, 'fields': {'name': 'name7', 'email': '7@email', 'country': 'uk'}}]
af [{'id': 5, 'fields': {'name': 'name5', 'email': '5@email', 'country': 'af'}}, {'id': 9, 'fields': {'name': 'name9', 'email': '9@email', 'country': 'af'}}]
mx [{'id': 6, 'fields': {'name': 'name6', 'email': '6@email', 'country': 'mx'}}]
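If every record is known to carry the nested key under 'fields', the loop over all values in make_groups could be replaced by a direct lookup. This is only a minimal sketch under that assumption; group_by_field, its parent_key parameter and the sample records are hypothetical names made up for illustration:

```python
import collections


def group_by_field(records, nested_key, parent_key="fields"):
    """Group records by record[parent_key][nested_key]."""
    groups = collections.defaultdict(list)
    for record in records:
        # Direct lookup: raises KeyError if a record lacks the key,
        # instead of silently skipping it.
        groups[record[parent_key][nested_key]].append(record)
    return groups


# Made-up sample data in the same shape as the question's records.
sample = [
    {"id": 1, "fields": {"name": "a", "country": "uk"}},
    {"id": 2, "fields": {"name": "b", "country": "fr"}},
    {"id": 3, "fields": {"name": "c", "country": "uk"}},
]
by_country = group_by_field(sample, "country")
print({country: [p["id"] for p in persons]
       for country, persons in by_country.items()})
# {'uk': [1, 3], 'fr': [2]}
```

The trade-off is that a malformed record now fails loudly with a KeyError rather than being left out of the result without notice.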
Upvotes: 0
Reputation: 8921
I think this is close to what you need:
import itertools
result = {key:[i for i in value] for key, value in itertools.groupby(people, lambda item: item["fields"]["country"])}
What this does is use itertools.groupby to group all people in the people list by their specified country. The resulting dictionary has countries as keys, and the unpacked groupings (matching people) as values. Input is expected as a list of dictionaries like the one in your example:
people = [{"id": "1234565", "fields": {"name": "john", "email":"[email protected]", "country": "uk"}},
{"id": "654321", "fields": {"name": "sam", "email":"[email protected]", "country": "uk"}}]
Sample output:
>>> print(result)
{'uk': [{'fields': {'name': 'john', 'email': '[email protected]', 'country': 'uk'}, 'id': '1234565'}, {'fields': {'name': 'sam', 'email': '[email protected]', 'country': 'uk'}, 'id': '654321'}]}
For a cleaner result, the looping construct can be tweaked so that only the ID of each person is included in the result dict:
result = {key:[i["id"] for i in value] for key, value in itertools.groupby(people, lambda item: item["fields"]["country"])}
>>> print(result)
{'uk': ['1234565', '654321']}
EDIT: Sorry, I forgot about the sorting. groupby only groups consecutive items with equal keys, so sort the list of people by country first and pass the sorted list to groupby instead of people. It should then work properly:
sort = sorted(people, key=lambda item: item["fields"]["country"])
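To show why the sort matters, here is a minimal sketch that runs the grouping with and without it. The sample records and the by_country helper name are assumptions for illustration, not part of the original data:

```python
import itertools


def by_country(item):
    # Key function shared by sorted() and groupby() so both agree.
    return item["fields"]["country"]


# Made-up records; the two "uk" entries are deliberately not adjacent.
people = [
    {"id": "1", "fields": {"name": "john", "country": "uk"}},
    {"id": "2", "fields": {"name": "ana", "country": "br"}},
    {"id": "3", "fields": {"name": "sam", "country": "uk"}},
]

# Without sorting, groupby yields "uk" twice and the later group
# overwrites the earlier one in the dict.
unsorted_result = {key: [p["id"] for p in group]
                   for key, group in itertools.groupby(people, by_country)}
print(unsorted_result)  # {'uk': ['3'], 'br': ['2']}

# Sorting first makes equal keys adjacent, so each country appears once.
sorted_people = sorted(people, key=by_country)
result = {key: [p["id"] for p in group]
          for key, group in itertools.groupby(sorted_people, by_country)}
print(result)  # {'br': ['2'], 'uk': ['1', '3']}
```

Note that the unsorted run does not raise an error; it just drops records, which makes the missing sort easy to overlook.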
Upvotes: 3