Dawud Sayeed

Reputation: 37

How to extract sentences from one text with only 1 named entity using spaCy?

I have a list of sentences, and I want to append to a new list only those sentences that contain exactly 1 "PERSON" named entity, using spaCy. The code I used was as follows:

test_list = []
for item in sentences: #for each sentence in 'sentences' list
  for ent in item.ents: #for each entity in the sentence's entities 
    if len(ent in item.ents) == 1: #if there is only one entity
      if ent.label_ == "PERSON": #and if the entity is a "PERSON"
        test_list.append(item) #put the sentence into 'test_list'

But then I get:

TypeError: object of type 'bool' has no len()

Am I doing this wrong? How exactly would I complete this task?

Upvotes: 1

Views: 1120

Answers (1)

Wiktor Stribiżew

Reputation: 626748

You get the error because ent in item.ents is a membership test that evaluates to a boolean, and len() cannot be applied to a boolean.
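To see the failure in isolation, here is a minimal sketch that needs no spaCy at all. The tuple of strings is only a hypothetical stand-in for item.ents; the point is that the in expression produces True or False before len() ever sees it.

```python
# Hypothetical stand-in for item.ents: any sequence works for this demonstration
ents = ("Alice", "Paris")
ent = ents[0]

# "ent in ents" is a membership test, so the result is a bool, not a sequence
result = ent in ents

try:
    len(result)  # len() is not defined for bool
except TypeError as exc:
    message = str(exc)

print(result)   # True
print(message)  # object of type 'bool' has no len()
```

So the length check has to be applied to item.ents itself, not to the result of the in test.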

What you want is

test_list = []
for item in sentences: #for each sentence in 'sentences' list
    if len(item.ents) == 1 and item.ents[0].label_ == "PERSON": #if there is only one entity and if the entity is a "PERSON"
        test_list.append(item) #put the sentence into 'test_list'

The len(item.ents) == 1 check verifies that exactly one entity was detected in the sentence, and item.ents[0].label_ == "PERSON" makes sure that this entity's label is PERSON.

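Note the and operator: both conditions must be met. Because and short-circuits, item.ents[0] is only evaluated when the sentence has exactly one entity, so a sentence with no entities does not raise an IndexError.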
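If what you actually need is "exactly one PERSON entity, regardless of any other entities in the sentence", that is a different condition, and you may want to count the PERSON entities instead. The following is only a sketch under that assumption: has_single_person and fake_sentence are hypothetical helper names, and the SimpleNamespace objects merely imitate the .ents and .label_ attributes of spaCy spans so the example runs without a loaded model.

```python
from types import SimpleNamespace


def has_single_person(sentence, label="PERSON"):
    """Return True if the sentence has exactly one entity with the given label."""
    # Works on anything exposing .ents whose items expose .label_
    return sum(1 for ent in sentence.ents if ent.label_ == label) == 1


def fake_sentence(text, labels):
    """Hypothetical stand-in for a spaCy sentence span (illustration only)."""
    return SimpleNamespace(
        text=text,
        ents=[SimpleNamespace(label_=lbl) for lbl in labels],
    )


sentences = [
    fake_sentence("Alice moved to Paris.", ["PERSON", "GPE"]),  # one PERSON plus another entity
    fake_sentence("Alice met Bob.", ["PERSON", "PERSON"]),      # two PERSON entities
    fake_sentence("Alice smiled.", ["PERSON"]),                 # exactly one entity, a PERSON
    fake_sentence("It rained.", []),                            # no entities
]

# Keep only the sentences with exactly one PERSON entity
test_list = [s for s in sentences if has_single_person(s)]

print([s.text for s in test_list])
```

With real spaCy sentences you would pass the spans from your own sentences list to has_single_person in the same way. Unlike the len(item.ents) == 1 version, this also keeps "Alice moved to Paris.", since the extra GPE entity is ignored.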

Upvotes: 1
