Erwin

Reputation: 381

Combine nested dictionaries in a list of nested dictionaries based on a matching key:value pair

I searched and found a question that is very similar to my use case: "combine dictionaries in list of dictionaries based on matching key:value pair". It does not fully match my case, though, because I have a list of nested dictionaries. My real list contains more than two nested dictionaries, but I use two here to keep the example short:

my_list = [{'sentence': ['x',
   'ray',
   'diffractometry',
   'has',
   'been',
   'largely',
   'used',
   'thanks',
   'to',
   ],
  'mentions': [{'mention': [27, 28],
    'positives': [26278, 27735, 21063],
    'negatives': [],
    'entity': 27735}]},
 {'sentence': ['x',
   'ray',
   'diffractometry',
   'has',
   'been',
   'largely',
   'used',
   'thanks',
   'to',
   ],
  'mentions': [{'mention': [13, 14],
    'positives': [7654],
    'negatives': [],
    'entity': 7654}]}]

How can I merge these two dictionaries when the value of the 'sentence' key (the list of all tokens) matches, so that I get the desired result below?

my_new_list = [
{'sentence': ['x',
   'ray',
   'diffractometry',
   'has',
   'been',
   'largely',
   'used',
   'thanks',
   'to',
   ],
  'mentions': [
    {'mention': [27, 28],
    'positives': [26278, 27735, 21063],
    'negatives': [],
    'entity': 27735
    },
   {'mention': [13, 14],
    'positives': [7654],
    'negatives': [],
    'entity': 7654
     }
   ]
}
]

In other words, how do I merge the 'mentions' lists of dictionaries whose 'sentence' values (the lists of all tokens) are equal? My actual list contains many dictionaries with this structure.

Many thanks for your help.

Upvotes: 0

Views: 893

Answers (2)

log0

Reputation: 10917

From what I understand, you want to group information by "sentence".

You can do this by iterating over your list and filling a dictionary of lists keyed by sentence. Lists are not hashable, so the token list is converted to a tuple before it is used as a key.

Something like:

from collections import defaultdict
sentences = defaultdict(list)
for element in my_list:
   key = tuple(element["sentence"])
   sentences[key].append(element)

This gives you a mapping of the form:

 { sentence1: [element1, element2], sentence2: [element3] }

From there you should be able to easily construct the structure you want.
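
One way to finish the construction from the grouped dictionary is sketched below. The function name merge_by_sentence and the shortened sample data are made up for illustration; only the 'sentence' and 'mentions' keys come from the question. It assumes every element has both keys and that 'mentions' is always a list.

```python
from collections import defaultdict


def merge_by_sentence(rows):
    """Group rows by their 'sentence' tokens and concatenate their 'mentions'."""
    grouped = defaultdict(list)
    for row in rows:
        # Lists are not hashable, so the token list is converted to a tuple.
        grouped[tuple(row["sentence"])].append(row)

    merged = []
    for sentence, group in grouped.items():
        mentions = []
        for row in group:
            # Build a new list so the input rows are left unchanged.
            mentions.extend(row["mentions"])
        merged.append({"sentence": list(sentence), "mentions": mentions})
    return merged


# Shortened, hypothetical sample data in the shape used in the question.
sample = [
    {"sentence": ["x", "ray"], "mentions": [{"mention": [27, 28], "entity": 27735}]},
    {"sentence": ["x", "ray"], "mentions": [{"mention": [13, 14], "entity": 7654}]},
]
result = merge_by_sentence(sample)
print(len(result))                 # 1
print(len(result[0]["mentions"]))  # 2
```

Because dictionaries keep insertion order in Python 3.7+, the merged entries come out in the order each sentence first appears in the input.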

Edit: removed reference to specific fields.

Upvotes: 0

Amith Lakkakula

Reputation: 516

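my_dict = {}
for row in my_list:
    # Use the joined sentence as the key (assumes no token contains a space)
    key = ' '.join(row['sentence'])
    if key in my_dict:
        # Sentence already seen: append this row's mentions to the stored entry
        my_dict[key]['mentions'].extend(row['mentions'])
    else:
        # First occurrence: store a copy so the original rows are not mutated
        my_dict[key] = {**row, 'mentions': list(row['mentions'])}

# One merged dictionary per distinct sentence, in order of first appearance
my_list = list(my_dict.values())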

Upvotes: 1
