What are the ways to extract key-value pairs from unstructured text?

Question

I'm trying to figure out what approaches exist for extracting values for predefined keys from unstructured text, and which of them works best.

Input:

The doctor prescribed me a drug called favipiravir.
His name is Yury.
Ilya has already told me about that.
The weather is cold today.
I am taking a medicine called nazivin.

Key list: ['drug', 'name', 'weather']

Output:

['drug=favipiravir', 'drug=nazivin', 'name=Yury', 'weather=cold']

As you can see, the third sentence contains no explicit 'name' key, so no value is extracted from it (I think this is where the task differs from NER). At the same time, 'drug' and 'medicine' are synonyms, so 'medicine' should be treated as the 'drug' key and its value extracted as well.

A follow-up question: what if the key set is mutable? Should I start from a regexp-based approach, since the keys are predefined, or is there a way to implement this with supervised learning or a neural network? (And in that case, how would I deal with mutable keys?)
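
To make the regexp option concrete, here is a minimal sketch of the baseline I have in mind. The synonym table and the two trigger phrasings ("<key> called <value>" and "<key> is <value>") are my own assumptions and only cover the example input above. The names SYNONYMS, build_pattern and extract are hypothetical helpers, not part of any library.

```python
import re

# Hypothetical synonym table: surface word -> canonical key (an assumption).
SYNONYMS = {
    "drug": "drug",
    "medicine": "drug",
    "name": "name",
    "weather": "weather",
}


def build_pattern(words):
    """Compile one regex matching '<word> called <value>' or '<word> is <value>'."""
    # Longest words first so a longer synonym is preferred over a shorter prefix.
    alternation = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    return re.compile(
        rf"\b({alternation})\b\s+(?:called|is)\s+(\w+)", re.IGNORECASE
    )


def extract(text, keys, synonyms=SYNONYMS):
    """Return sorted 'key=value' strings for the requested keys only."""
    # Keep only the surface words whose canonical key was requested, so a
    # changed key list just means the pattern is rebuilt.
    words = [w for w, key in synonyms.items() if key in keys]
    if not words:
        return []
    pattern = build_pattern(words)
    return sorted(
        f"{synonyms[m.group(1).lower()]}={m.group(2)}"
        for m in pattern.finditer(text)
    )


TEXT = """The doctor prescribed me a drug called favipiravir.
His name is Yury.
Ilya has already told me about that.
The weather is cold today.
I am taking a medicine called nazivin."""

print(extract(TEXT, ["drug", "name", "weather"]))
# ['drug=favipiravir', 'drug=nazivin', 'name=Yury', 'weather=cold']
```

With this, a mutable key set only requires updating the key list and the synonym table, but every new phrasing needs another hand-written pattern, which is why I'm asking whether a learned approach would be a better fit.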

