Reputation: 381
I searched and found a question that is very similar to my use case: "combine dictionaries in list of dictionaries based on matching key:value pair". It does not quite fit my case, though, because I have a list of nested dictionaries. Suppose I have a list of nested dictionaries (in practice more than two, but I use two here to keep the example short):
my_list = [
    {'sentence': ['x', 'ray', 'diffractometry', 'has', 'been',
                  'largely', 'used', 'thanks', 'to'],
     'mentions': [{'mention': [27, 28],
                   'positives': [26278, 27735, 21063],
                   'negatives': [],
                   'entity': 27735}]},
    {'sentence': ['x', 'ray', 'diffractometry', 'has', 'been',
                  'largely', 'used', 'thanks', 'to'],
     'mentions': [{'mention': [13, 14],
                   'positives': [7654],
                   'negatives': [],
                   'entity': 7654}]},
]
How can I merge these two dictionaries when the value of the 'sentence' key (the list of all tokens) matches, so that I get the desired result below?
my_new_list = [
    {'sentence': ['x', 'ray', 'diffractometry', 'has', 'been',
                  'largely', 'used', 'thanks', 'to'],
     'mentions': [
         {'mention': [27, 28],
          'positives': [26278, 27735, 21063],
          'negatives': [],
          'entity': 27735},
         {'mention': [13, 14],
          'positives': [7654],
          'negatives': [],
          'entity': 7654},
     ]},
]
In other words, how do I merge the "mentions" lists of all entries whose "sentence" values (the full token lists) match? My actual list contains many dictionaries with this same structure.
Many thanks for your help.
Upvotes: 0
Views: 893
Reputation: 10917
From what I understand, you want to group information by "sentence".
You can do this by iterating over your list and filling a dictionary of lists keyed by sentence.
Something like:
from collections import defaultdict

sentences = defaultdict(list)
for element in my_list:
    # a list is not hashable, so convert the tokens to a tuple to use them as a key
    key = tuple(element["sentence"])
    sentences[key].append(element)
This gives you a mapping of the form:
{ sentence1: [element1, element2], sentence2: [element3] }
From there you should be able to easily construct the structure you want.
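One possible way to build the final structure is sketched below. It is only an illustration of the grouping idea, not necessarily what the answer's author had in mind: the function name `merge_by_sentence` is hypothetical, the sample data is shortened, and it assumes every record has exactly the keys "sentence" and "mentions". Instead of collecting whole elements per sentence, it collects the mentions directly.

```python
from collections import defaultdict


def merge_by_sentence(records):
    """Group records by token list and concatenate their 'mentions' lists.

    Hypothetical helper; assumes each record has 'sentence' and 'mentions'.
    """
    grouped = defaultdict(list)
    for record in records:
        # Lists are unhashable, so the token list is converted to a tuple.
        grouped[tuple(record["sentence"])].extend(record["mentions"])
    # Dicts keep insertion order (Python 3.7+), so sentences stay in
    # first-seen order. New outer dicts and lists are built, so the
    # input records themselves are not modified.
    return [
        {"sentence": list(tokens), "mentions": mentions}
        for tokens, mentions in grouped.items()
    ]


# Shortened sample data for illustration only.
my_list = [
    {"sentence": ["x", "ray"], "mentions": [{"mention": [27, 28], "entity": 27735}]},
    {"sentence": ["x", "ray"], "mentions": [{"mention": [13, 14], "entity": 7654}]},
]
my_new_list = merge_by_sentence(my_list)
```

Because the merged list is rebuilt from scratch, any other keys on the original records would be dropped; if your real data carries extra fields, they would need to be copied over explicitly.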
Edit: removed reference to specific fields.
Upvotes: 0
Reputation: 516
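my_dict = {}
for row in my_list:
    key = ' '.join(row['sentence'])  # use sentence as key
    if key in my_dict:
        # sentence seen before: append this row's mentions to the first row's list
        my_dict[key]['mentions'].extend(row['mentions'])
    else:
        my_dict[key] = row  # first occurrence; this row is reused and modified in place
my_list = list(my_dict.values())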
Upvotes: 1