Praveen Kumar

Reputation: 114

Named Entity Recognition using context of the sentence

I want to extract or label an entity based on the context in which it is used in a sentence.

For example, how can I extract a date only when it is used in the context of a date of birth?

I know that regular expressions, spaCy, or NLTK can extract date fields from a document, but I cannot work out an approach for extracting a date based on the context in which it is used.

Example 1: My birthday is on 9th December. Here, spaCy or a regex will mark "9th December" as a date field, but I want it marked as a custom entity, 'date of birth'.

Example 2: I am going for a movie on 1st April. Here, "1st April" should be marked as a normal date field.

Upvotes: 3

Views: 1677

Answers (1)

Jindřich

Reputation: 11213

Named entity recognition is defined only as marking contiguous segments of a sentence and assigning each a label from a predefined set. Machine-learned recognizers (such as the one used by spaCy) do use the context of the whole sentence. However, a trained model can only predict the labels it was trained on, so it will not produce a new label such as 'date of birth' on its own. If you have a large corpus in which such entities are annotated, you can re-train the spaCy model so that it learns your labels.
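To make "a corpus where such entities are annotated" concrete, here is a minimal sketch of what the annotations could look like. The `annotate` helper and the `DATE_OF_BIRTH` label name are made up for illustration. The `(text, {"entities": [(start, end, label)]})` shape with character offsets is the one I recall from spaCy's NER training examples, but check the documentation for your spaCy version before relying on it.

```python
def annotate(text, span, label):
    """Return one training example with the character offsets of `span`.

    Hypothetical helper: it labels only the first occurrence of `span`.
    """
    start = text.find(span)
    if start == -1:
        raise ValueError(f"{span!r} not found in {text!r}")
    return (text, {"entities": [(start, start + len(span), label)]})


# The two sentences from the question, with the custom label on the first.
TRAIN_DATA = [
    annotate("My birthday is on 9th December.", "9th December", "DATE_OF_BIRTH"),
    annotate("I am going for a movie on 1st April.", "1st April", "DATE"),
]

for text, annotations in TRAIN_DATA:
    for start, end, label in annotations["entities"]:
        print(label, "->", text[start:end])
```

A real training set would need many such sentences per label, with varied wording, for the model to pick up the contextual difference.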

A heavier option, possibly overkill here, is to use knowledge extraction methods, which connect recognized entities and assign semantic labels to the connections. In your case it would be something like: [PERSON] (was born on) [DATE].

Anyway, if the task you want to solve is as easy as re-labeling an entity in a specific context, I would write a set of rules for that specific case. Something like: if the entity is a date and the sentence contains 'born' or 'birth', it is your date-of-birth entity. You can also write fancier rules based on the dependency parse, which you get from spaCy as well.
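A minimal sketch of the keyword rule, under some assumptions: a simple regex stands in for the DATE entities that spaCy would give you (so the example runs without a model), and the cue-word list, the label names, and the `label_dates` function are made up for illustration.

```python
import re

# Stand-in date matcher: a day number followed by a month name,
# e.g. "9th December". In practice the spans would come from the NER model.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_RE = re.compile(
    r"\b\d{1,2}(?:st|nd|rd|th)?\s+(?:" + MONTHS + r")\b", re.IGNORECASE
)

# Assumed cue words that signal a date-of-birth context.
BIRTH_CUES = re.compile(r"\b(?:born|birth|birthday)\b", re.IGNORECASE)


def label_dates(sentence):
    """Return (date_text, label) pairs for every date found in the sentence.

    Every date in a sentence containing a birth cue gets DATE_OF_BIRTH;
    otherwise it keeps the generic DATE label.
    """
    label = "DATE_OF_BIRTH" if BIRTH_CUES.search(sentence) else "DATE"
    return [(match.group(0), label) for match in DATE_RE.finditer(sentence)]


print(label_dates("My birthday is on 9th December."))
print(label_dates("I am going for a movie on 1st April."))
```

This rule works at sentence level, so a sentence that mentions a birthday and an unrelated date would get both labeled as date of birth. That is the kind of case where rules based on the dependency parse would do better.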

Upvotes: 0
