user14083750

How to remove an entity from a sentence with spaCy?

I want to randomly remove one NORP, GPE, MONEY, ORDINAL, or PERCENT entity from a sentence using spaCy. For example:

Donald John Trump[person] (born June 14, 1946)[date] is the 45th[ordinal] and current president of the United States[GPE]. Before entering politics, he was a businessman and television personality.

Now how can I remove a certain entity from this sentence? In this example, the function chose to remove 45th, an ordinal entity.

>>> sentence = 'Donald John Trump (born June 14, 1946) is the 45th and current president of the United States. Before entering politics, he was a businessman and television personality.'
>>> remove(sentence)
45th

Upvotes: 1

Views: 915

Answers (1)

Sergey Bushmanov

Reputation: 25249

Please try spaCy NER together with np.random.choice:

import numpy as np
import spacy

nlp = spacy.load("en_core_web_md")

sentence = 'Donald John Trump (born June 14, 1946) is the 45th and current president of the United States. Before entering politics, he was a businessman and television personality.'
doc = nlp(sentence)

# Keep only the texts of entities with the labels of interest
ents = [e.text for e in doc.ents if e.label_ in ("NORP", "GPE", "MONEY", "ORDINAL", "PERCENT")]
# Pick one entity text at random
remove = lambda x: str(np.random.choice(x))
# The pick is random, so the output varies between runs; one possible result:
remove(ents)
'45th'

Should you wish to remove a random entity from the sentence text:

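def remove_from_sentence(sentence):
    doc = nlp(sentence)
    # Merge each multi-token entity into a single token
    with doc.retokenize() as retokenizer:
        for e in doc.ents:
            retokenizer.merge(doc[e.start:e.end])
    # Pair each token with its trailing whitespace so the text can be rebuilt
    tok_pairs = [(tok.text, tok.whitespace_) for tok in doc]
    ents = [e.text for e in doc.ents if e.label_ in ("NORP", "GPE", "MONEY", "ORDINAL", "PERCENT")]
    # Uses remove() defined above; np.random.choice raises ValueError if ents is empty
    ent_to_remove = remove(ents)
    print(ent_to_remove)
    # Drop every token whose text equals the chosen entity
    tok_pairs_out = [pair for pair in tok_pairs if pair[0] != ent_to_remove]
    return "".join(text + ws for text, ws in tok_pairs_out)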

remove_from_sentence(sentence)

the United States
'Donald John Trump (born June 14, 1946) is the 45th and current president of . Before entering politics, he was a businessman and television personality.'
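
Note that the result above keeps the space before the period ("of ."), and the function removes every token whose text matches the chosen entity. If that matters, here is a rough sketch of a variant that cuts exactly one entity out by its character offsets and tidies the leftover space. The names remove_random_span, spans_from_doc and TARGET_LABELS are made up for illustration and are not part of spaCy; the only spaCy attributes assumed are ent.start_char, ent.end_char and ent.label_. It uses the standard random module rather than np.random.choice.

```python
import random

# Entity labels that are candidates for removal (same set as in the answer)
TARGET_LABELS = {"NORP", "GPE", "MONEY", "ORDINAL", "PERCENT"}


def remove_random_span(text, spans, labels=TARGET_LABELS, rng=random):
    """Remove one randomly chosen span from text.

    spans is an iterable of (start_char, end_char, label) tuples.
    Returns (removed_text, new_text), or (None, text) if no span qualifies.
    """
    candidates = [(start, end) for start, end, label in spans if label in labels]
    if not candidates:
        # Nothing to remove: return the text unchanged instead of raising
        return None, text

    start, end = rng.choice(candidates)
    removed = text[start:end]
    before, after = text[:start], text[end:]

    # Avoid leaving a double space or a space before punctuation
    if before.endswith(" ") and (not after or after[0] in " .,;:!?)"):
        before = before[:-1]
    elif not before and after.startswith(" "):
        after = after[1:]

    return removed, before + after


def spans_from_doc(doc):
    """Hypothetical helper: convert the entities of a spaCy Doc to plain tuples."""
    return [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
```

With the nlp object loaded earlier, it could be called as remove_random_span(sentence, spans_from_doc(nlp(sentence))). Working on plain tuples also means the removal logic can be checked without loading a model.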

Please ask if something is not clear.

Upvotes: 5
