Reputation: 536
I am working on an NLP project and I need the following functionality illustrated by an example. Say there is a sentence
Tell Sam that he will have to leave without Arthur, as he is sick.
In this statement, the first he
has to be tagged to Sam and the second he
to Arthur. I work in Python. Any suggestions on what I can use to get the following functionality?
Upvotes: 5
Views: 2745
Reputation: 130
Like others suggested, this is coreference resolution which is an NLP active research topic.
Try the following code from huggingface(spacy):
import spacy
nlp = spacy.load('en')
import neuralcoref
neuralcoref.add_to_pipe(nlp,greedyness=0.52)
doc = nlp("Tell Sam that he will have to leave without Arthur, as he is sick.")
print(doc._.coref_resolved)
You can adjust the greedyness of the algo to get more resolutions(replacements of pronouns). Keep in mind that increasing greedyness might give you incorrect resolutions, it will depend on your use case.
Upvotes: 3
Reputation: 2079
Update:
There are now Python native tools with coreference resolution, such as:
AllenNLP by AllenAI.
Huggingface which is almost a spaCy expansion.
StanfordNLP by Stanford.
These references were mainly retrieved from this nice RASA (a NLU based chatbot solution) tutorial: https://github.com/RasaHQ/tutorial-knowledge-base
Upvotes: 3
Reputation: 5560
This task is called coreference resolution. In order to parse complex cases like the one you mention, you'd need to use a coreference resolution system, most of which (free/OOS) are developed in Java. There are several ways to easily use them from Python. One of the most well-know is this Standford CoreNLP wrapper: https://github.com/dasmith/stanford-corenlp-python
Upvotes: 7