Reputation: 664
I have a list of dictionaries containing text:
list_dicts = [{'id': 1, 'text': 'hello my name is Carla'}, {'id': 2, 'text': 'hello my name is John' }]
I applied spaCy named entity recognition to the nested texts like so:
for d in list_dicts:
    for k, v in d.items():
        if k == 'text':
            doc = nlp(v)
            for ent in doc.ents:
                print([ent.text, ent.label_])
The output is a printout of the named entity text and its corresponding label, for example:
['Carla', 'PERSON']
['John', 'PERSON']
I would like to add the named entities to their corresponding text in each nested dictionary, which would look like this:
list_dicts = [{'id': 1, 'text': 'hello our names are Carla and Bob', 'entities': [['Carla', 'PERSON'], ['Bob', 'PERSON']]}, {'id': 2, 'text': 'hello my name is John', 'entities': [['John', 'PERSON']]}]
So far, I have tried using zip() to link the entities to the original text and then convert the result into a new list of dictionaries, but zip() does not seem to work with the spaCy objects.
Upvotes: 0
Views: 724
Reputation: 82785
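Use dict.setdefault, which creates the 'entities' list the first time an entity is found for a dictionary and returns the existing list after that, so each entity can be appended in place.
Ex: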
for d in list_dicts:
    doc = nlp(d['text'])
    for ent in doc.ents:
        d.setdefault('entities', []).append([ent.text, ent.label_])
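One thing to be aware of with the loop above: a dictionary whose text has no entities never gets an 'entities' key, and running the loop a second time appends the same entities again. If that matters, a possible variant is to assign the list outright. The sketch below is only an illustration: add_entities is a hypothetical helper name, and fake_nlp is a made-up stand-in so the snippet runs without a downloaded model. In real use you would pass your own spaCy pipeline instead.

```python
from types import SimpleNamespace


def add_entities(records, nlp):
    """Attach an 'entities' list to every record, in place."""
    for record in records:
        doc = nlp(record["text"])
        # Assigning instead of appending means reruns do not duplicate
        # entries, and texts without entities get an empty list.
        record["entities"] = [[ent.text, ent.label_] for ent in doc.ents]
    return records


def fake_nlp(text):
    """Hypothetical stand-in for a spaCy pipeline (not part of spaCy).

    It only mimics the two attributes used above: doc.ents, and
    ent.text / ent.label_. In real use, pass something like
    nlp = spacy.load("en_core_web_sm") instead.
    """
    known_names = {"Carla", "Bob", "John"}
    ents = [
        SimpleNamespace(text=word, label_="PERSON")
        for word in text.split()
        if word in known_names
    ]
    return SimpleNamespace(ents=ents)


list_dicts = [
    {"id": 1, "text": "hello our names are Carla and Bob"},
    {"id": 2, "text": "hello my name is John"},
    {"id": 3, "text": "hello there"},
]

add_entities(list_dicts, fake_nlp)
for d in list_dicts:
    print(d)
```

With a real pipeline the entities found depend on the model, so the result may differ from what the stand-in produces here.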
Upvotes: 1