Reputation: 664

Append to a nested list in a list of dicts under conditions

I have two lists that share information. First, I want a unique set of names (list_person has repeated name values), so I build a new list of dictionaries. Then I want to append list_pets['pet'] to the matching person's 'pets' list in that new list of unique names whenever list_pets['person_id'] matches list_person['id'].

For clarification here is my code + desired output:

My current code:

list_person = [{'id': 12345, 'name': 'Bobby Bobs', 'pets': ['cat']}, # you see that name values are repeated
              {'id': 678910, 'name': 'Bobby Bobs', 'pets': ['zebra']},
              {'id': 111213, 'name': 'Lisa Bobs', 'pets': ['horse']},
              {'id': 141516, 'name': 'Lisa Bobs', 'pets': ['rabbit']}]

list_pets = [{'id': 'abcd', 'pet': 'shark', 'person_id': 12345}, #Bobby Bobs' pets
             {'id': 'efgh', 'pet': 'tiger', 'person_id': 678910}, #Bobby Bobs' pets
             {'id': 'ijkl', 'pet': 'elephant', 'person_id': 111213}, #Lisa Bobs' pets
             {'id': 'mnopq', 'pet': 'dog', 'person_id': 141516}] #Lisa Bobs' pets

output = []
for person, pet in zip(list_person, list_pets):
    t = [temp_dict['name'] for temp_dict in output]
    if person['name'] not in t:
        output.append(person)    # make a new list of dicts with unique name values
        for unique_person in output: # if they share ID, add the missing pets. 
            if person['id'] == pet['person_id']:
                unique_person['pets'].append(pet['pet'])
print(output)

Desired output:

desired_out = [{'id': 12345, 'name': 'Bobby Bobs', 'pets': ['cat', 'zebra', 'shark', 'tiger']},
                {'id': 111213, 'name': 'Lisa Bobs', 'pets': ['horse', 'rabbit', 'elephant', 'dog']}]

Current output:

[{'id': 12345, 'name': 'Bobby Bobs', 'pets': ['cat', 'shark', 'elephant']}, {'id': 111213, 'name': 'Lisa Bobs', 'pets': ['horse', 'elephant']}]

My current output does not contain all the correct pets. Why is that, and what would get me closer to a solution?

Upvotes: 3

Views: 79

Answers (2)

aneroid

Reputation: 15962

Here's a non-pandas solution that doesn't rely on list_person (aka 'people') and list_pets being in the same order, so I'm not assuming that Bobby's data is the first two entries in both lists.

Initially, output is a mapping from each name to that person's data, including pets. ids maps every one of a person's IDs to that same data dict, intentionally holding a reference to the dict and not a copy.

Note that when a person is first added to output, it is added as a deepcopy so that later changes don't affect the original item in list_person.

import copy

output = {}  # dict, not list
ids = {}  # needed to match with pets which has person_id

for person in list_person:
    if (name := person['name']) in output:
        output[name]['pets'].extend(person['pets'])
        output[name]['id'].append(person['id'])
        ids[person['id']] = output[name]  # intentionally a reference, not a copy
    else:
        output[name] = copy.deepcopy(person)  # so that the pet list is created as a copy
        output[name]['id'] = [person['id']]  # turn the id into a list
        ids[person['id']] = output[name]  # intentionally a reference, not a copy

for pet in list_pets:
    # several keys in the ids dict can refer to the same object,
    # so use that to our advantage by appending directly to its 'pets' list
    ids[pet['person_id']]['pets'].append(pet['pet'])

output is now:

{'Bobby Bobs': {'id': [12345, 678910],
                'name': 'Bobby Bobs',
                'pets': ['cat', 'zebra', 'shark', 'tiger']},
 'Lisa Bobs': {'id': [111213, 141516],
               'name': 'Lisa Bobs',
               'pets': ['horse', 'rabbit', 'elephant', 'dog']}
}

Final step to make it a list and only use one id for each person:

output = list(output.values())
for entry in output:
    entry['id'] = entry['id'][0]  # just the first id

Final output:

[{'id': 12345,
  'name': 'Bobby Bobs',
  'pets': ['cat', 'zebra', 'shark', 'tiger']},
 {'id': 111213,
  'name': 'Lisa Bobs',
  'pets': ['horse', 'rabbit', 'elephant', 'dog']}]

And if you don't mind multiple ids, skip the last step above and leave it at output = list(output.values()).
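
To see the reference-sharing idea in isolation, here is a minimal sketch. The names first and merged and the tiny 'Ann' record are made up for illustration and are not part of the question's data.

```python
import copy

# Hypothetical miniature of the data above: one person known under two ids.
first = {'id': 1, 'name': 'Ann', 'pets': ['cat']}

merged = copy.deepcopy(first)   # independent copy, so `first` stays untouched
ids = {1: merged, 2: merged}    # both ids refer to the same dict object

ids[2]['pets'].append('dog')    # append through the second id

print(ids[1] is ids[2])         # True: one object reachable through two keys
print(ids[1]['pets'])           # ['cat', 'dog']
print(first['pets'])            # ['cat']
```

Because both keys point at one dict, an append made through either id is visible through the other, which is what lets the pets loop work without looking names up again.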

Upvotes: 1

Amit Vikram Singh

Reputation: 2128

import itertools
import pandas as pd
person_df = pd.DataFrame(list_person)
pets_df = pd.DataFrame(list_pets).drop(columns = ['id'])
joined_df = person_df.merge(pets_df, left_on = ['id'], right_on = ['person_id'])

Joined df:

>>> joined_df
       id        name      pets       pet  person_id
0   12345  Bobby Bobs     [cat]     shark      12345
1  678910  Bobby Bobs   [zebra]     tiger     678910
2  111213   Lisa Bobs   [horse]  elephant     111213
3  141516   Lisa Bobs  [rabbit]       dog     141516

Next, append each row's pet to its pets list, then group by name:

joined_df['pets'] = [pets + [pet] for pets, pet in zip(joined_df['pets'], joined_df['pet'])]
final_list = joined_df.groupby('name', as_index = False).agg(
                                  id = ('id', 'first'), 
                                  pets = ('pets', lambda x: list(itertools.chain(*x)))
                                ).to_dict('records')

Output:

>>> final_list
[{'name': 'Bobby Bobs', 'id': 12345, 'pets': ['cat', 'shark', 'zebra', 'tiger']},
 {'name': 'Lisa Bobs', 'id': 111213, 'pets': ['horse', 'elephant', 'rabbit', 'dog']}]

Upvotes: 1
