A. Boy

Reputation: 47

Extract Only Certain Named Entities From Tokens

Quick question (hopefully). Is it possible to get the named entities of a document, except for the ones with the CARDINAL label (the integer label is 397)? Here is my code:

import spacy

spacy_model = spacy.load('en_core_web_lg')  # package names use underscores, not hyphens
with open('temp.txt') as f:
    tokens = spacy_model(f.read())
named_entities = tokens.ents  # Except where the entity's label is CARDINAL (397)

Is this possible? Any help would be greatly appreciated.

Upvotes: 1

Views: 1111

Answers (1)

Wiktor Stribiżew

Reputation: 626802

You can filter out the unwanted entities with a list comprehension that compares each entity's string label, label_:

named_entities = [t for t in tokens.ents if t.label_ != 'CARDINAL']

Here is a test:

import spacy
nlp = spacy.load("en_core_web_sm")
tokens = nlp('The basket costs $10. I bought 6.')
print([(ent.text, ent.label_) for ent in tokens.ents])
# => [('10', 'MONEY'), ('6', 'CARDINAL')]
print([t for t in tokens.ents if t.label_ != 'CARDINAL'])
# => [10]
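If you need to exclude more than one label, or would rather compare the integer ID mentioned in the question, something along these lines should work. This is only a sketch: it uses a blank English pipeline and sets the entities by hand so that it runs without downloading a model, and single_token_span is a hypothetical helper written for this example, not part of spaCy. The integer ID is looked up through the vocab's string store rather than hard-coded as 397, since I would not assume that value is the same in every spaCy version.

```python
import spacy
from spacy.tokens import Span

# A blank pipeline has a tokenizer but no NER, so no model download is needed.
nlp = spacy.blank("en")
doc = nlp("The basket costs $10. I bought 6.")


def single_token_span(doc, word, label):
    """Hypothetical helper: label the first token whose text equals `word`."""
    token = next(t for t in doc if t.text == word)
    return Span(doc, token.i, token.i + 1, label=label)


# Entities are set by hand for illustration; a trained model would predict them.
doc.ents = [
    single_token_span(doc, "10", "MONEY"),
    single_token_span(doc, "6", "CARDINAL"),
]

# Option 1: exclude several labels at once by their string names.
excluded = {"CARDINAL", "ORDINAL"}
kept_by_name = [ent for ent in doc.ents if ent.label_ not in excluded]

# Option 2: compare integer IDs, looked up instead of hard-coded.
cardinal_id = doc.vocab.strings["CARDINAL"]
kept_by_id = [ent for ent in doc.ents if ent.label != cardinal_id]

print([(ent.text, ent.label_) for ent in kept_by_name])
# => [('10', 'MONEY')]
print([(ent.text, ent.label_) for ent in kept_by_id])
# => [('10', 'MONEY')]
```

With a real model, replace the blank pipeline and the hand-set doc.ents with spacy.load(...) and keep the two comprehensions as they are. Comparing label_ against a set is usually the more readable choice.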

Upvotes: 2
