Chris

Reputation: 131

Stanford CoreNLP: how to get the probability and margin of error

When using the parser or, for that matter, any of the annotators in CoreNLP, is there a way to access the probability or the margin of error?

I am particularly interested in detecting ambiguity programmatically. For example, in the sentence below, desire is tagged as a noun, but it could also be a verb.

I want to know if there is a way to retrieve a confidence score from the CoreNLP API that indicates ambiguity.

(NP (NP (NNP Whereas)) (, ,) (NP (NNP users) (NN desire) (S (VP (TO to) (VP (VB sell))))))

In this case, desire is labeled as NN (noun) instead of a verb. I need a way to check how confident CoreNLP is about this classification.

Upvotes: 13

Views: 654

Answers (1)

bsraskr

Reputation: 615

Stanford CoreNLP does not provide direct ambiguity scores, but you can detect ambiguity programmatically using these methods:

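1. POS Tagging Probability: Use the MaxentTagger in CoreNLP to get log-probabilities for POS tags. A low probability for the chosen tag, or a small gap between the two most likely tags, indicates higher ambiguity.

2. Multiple Parses: Use n-best (k-best) parsing with the Stanford Parser, for example via the parser query's getKBestPCFGParses method, to check whether alternative parses exist for the sentence. If "desire" appears with different POS tags across the top parses, it's ambiguous.

3. Compare Multiple Taggers: Run CoreNLP alongside Stanza, spaCy, or Flair and compare POS tags for the same tokens, using the same tagset. Different outputs suggest ambiguity.

4. Dependency Parsing Checks: If "desire" is tagged as a noun but behaves like a verb in the dependency parse (for example, it has a subject dependent such as "users"), the tag and the parse disagree, which suggests ambiguity.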
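Here is a rough sketch of how the first two checks could be scored once you have the raw numbers. It assumes you have already pulled per-tag log-probabilities for a token and a list of k-best parses as bracketed strings out of CoreNLP; that extraction is not shown. The function names, the sample log-probabilities, and the second parse are made up for illustration and are not CoreNLP output.

```python
import math
import re

def tag_margin(log_probs):
    """Return (best_tag, margin), where margin = P(best) - P(second best).

    log_probs: dict of tag -> natural-log score (hypothetical input format).
    """
    # Renormalise so the probabilities sum to 1 even if the scores do not.
    top = max(log_probs.values())
    weights = {tag: math.exp(lp - top) for tag, lp in log_probs.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, tag) for tag, w in weights.items()), reverse=True)
    best_p, best_tag = ranked[0]
    second_p = ranked[1][0] if len(ranked) > 1 else 0.0
    return best_tag, best_p - second_p

# Matches preterminals such as "(NN desire)" in a Penn-style bracketed parse.
PRETERMINAL = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")

def tags_by_position(parse):
    """Extract (word, tag) pairs, in sentence order, from one bracketed parse."""
    return [(word, tag) for tag, word in PRETERMINAL.findall(parse)]

def ambiguous_tokens(parses):
    """Return {position: (word, tags)} for tokens tagged differently across parses."""
    seen = {}
    for parse in parses:
        for i, (word, tag) in enumerate(tags_by_position(parse)):
            seen.setdefault(i, (word, set()))[1].add(tag)
    return {i: (w, tags) for i, (w, tags) in seen.items() if len(tags) > 1}

# Made-up scores for "desire"; a small margin means the tagger was torn.
desire_scores = {"NN": math.log(0.55), "VBP": math.log(0.40), "JJ": math.log(0.05)}
print(tag_margin(desire_scores))

parses = [
    # The parse from the question.
    "(NP (NP (NNP Whereas)) (, ,) (NP (NNP users) (NN desire) (S (VP (TO to) (VP (VB sell))))))",
    # A hypothetical alternative parse, written by hand for this example.
    "(S (ADVP (IN Whereas)) (, ,) (NP (NNS users)) (VP (VBP desire) (S (VP (TO to) (VP (VB sell))))))",
]
print(ambiguous_tokens(parses))
```

The margin between the top two tags is usually easier to threshold than the raw probability of the best tag, because it stays small whenever two readings compete, however many other tags share the rest of the mass.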

For programmatic ambiguity detection, combining log-probabilities, multiple parses, and external taggers is likely to be more robust than relying on any single signal, since each one can miss cases the others catch.
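A minimal sketch of one way to combine the signals, assuming each has already been reduced to one number per token: the top-two tag probability margin, the count of distinct tags seen across the k-best parses, and the tags assigned by each tagger. It also assumes all taggers share one tokenization and tagset. The tagger names, the thresholds, and the sample values are placeholders, not recommendations.

```python
from collections import Counter

def disagreement(tags_per_tagger):
    """Per-token disagreement: 1 - (share of taggers voting for the majority tag).

    tags_per_tagger: dict of tagger name -> list of tags, one per token.
    Assumes identical tokenization and a shared tagset across taggers.
    """
    sequences = list(tags_per_tagger.values())
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all taggers must produce one tag per token")
    scores = []
    for column in zip(*sequences):
        majority_votes = Counter(column).most_common(1)[0][1]
        scores.append(1.0 - majority_votes / len(column))
    return scores

def flag_ambiguous(margins, parse_tag_counts, disagreements,
                   min_margin=0.2, max_disagreement=0.0):
    """Flag a token when any one signal fires (thresholds are arbitrary)."""
    return [
        margin < min_margin or n_tags > 1 or dis > max_disagreement
        for margin, n_tags, dis in zip(margins, parse_tag_counts, disagreements)
    ]

# Tokens: Whereas , users desire to sell  (all values below are invented)
tags = {
    "corenlp":  ["NNP", ",", "NNP", "NN",  "TO", "VB"],
    "tagger_b": ["IN",  ",", "NNS", "VBP", "TO", "VB"],
    "tagger_c": ["IN",  ",", "NNS", "VBP", "TO", "VB"],
}
margins = [0.90, 0.99, 0.80, 0.15, 0.99, 0.95]
parse_tag_counts = [2, 1, 2, 2, 1, 1]

flags = flag_ambiguous(margins, parse_tag_counts, disagreement(tags))
print(flags)
```

Flagging on any single signal favours recall; if that produces too many false alarms, requiring two of the three signals to agree is a simple alternative.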

Upvotes: 0
