Sayantan

Reputation: 335

Possible error with Stanford POS Tagger and classifying intent and the replies

I have a specific use case, where a person would say something like this:

I would like to recognize the intent and the slots.

Then I use Stanford Parser to parse the sentence. For example, parsing "Note in object history object was last updated in may twenty eighteen" gives this list of tuples:

[('Note', 'VB'),
 ('in', 'IN'),
 ('object', 'NN'),
 ('history', 'NN'),
 ('object', 'NN'),
 ('was', 'VBD'),
 ('last', 'RB'),
 ('updated', 'VBN'),
 ('in', 'IN'),
 ('may', 'MD'),
 ('twenty', 'CD'),
 ('eighteen', 'CD')]
  1. How can I use this information to get the required output:

    • Where to note (we have a field in DB: Object History) and
    • What to note (object was last updated in may twenty eighteen).
  2. Another issue: since the input to the NLP step comes from an ASR system, capitalization is missing, and the POS tagger mis-tags 'note' as 'NN' (instead of 'VB'). Ideally 'note'/'record' should be tagged as a verb. How do I fix this likely error?

Upvotes: 0

Views: 58

Answers (1)

StanfordNLPHelp

Reputation: 8739

You can use the TrueCaseAnnotator to fix case issues:

https://stanfordnlp.github.io/CoreNLP/truecase.html

In general, you probably just want to use TokensRegex and write rule patterns to handle these templates. More info here:

https://stanfordnlp.github.io/CoreNLP/tokensregex.html
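
To illustrate the template idea, here is a minimal sketch in plain Python using the standard `re` module. It is not TokensRegex itself, only a rough analog of a "note in <field> <content>" rule. The function name `extract_note_slots` and the `KNOWN_FIELDS` list are hypothetical; the field names would come from your own database schema. Because the rule matches the trigger words 'note'/'record' directly and ignores case, it does not depend on the POS tag assigned to them.

```python
import re

# Hypothetical list of database field names; replace with the real schema.
KNOWN_FIELDS = ["object history", "object status"]


def extract_note_slots(utterance, fields=KNOWN_FIELDS):
    """Return (field, content) for 'note/record in <field> <content>', else None."""
    # Try longer field names first so a longer name wins over a shorter prefix.
    ordered = sorted(fields, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    pattern = re.compile(
        rf"^(?:note|record)\s+in\s+(?P<field>{alternatives})\s+(?P<content>.+)$",
        re.IGNORECASE,  # ASR output has no reliable capitalization
    )
    match = pattern.match(utterance.strip())
    if match is None:
        return None
    return match.group("field").lower(), match.group("content")


slots = extract_note_slots(
    "Note in object history object was last updated in may twenty eighteen"
)
```

Here `slots` holds the "where to note" field and the "what to note" text as a pair, or `None` when the utterance does not fit the template. A real TokensRegex rule could additionally constrain tokens by tag or lemma, which this sketch does not attempt.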

Upvotes: 1
