Michail Michailidis

Reputation: 12201

Stanford NLP POS Tagger has issues with very simple phrases?

I found examples of inconsistent behavior in my application using the Stanford NLP Parser/POS Tagger, and I was able to replicate them in the online demo at http://nlp.stanford.edu:8080/corenlp/process. I am using version 3.6.0:

Here are the 3 issues I have found so far:

[Screenshots: Stanford NLP POS Tagger output with and without a trailing dot]

I know that language is fairly ambiguous, but I would like to know whether I can trust this library even for such simple phrases. I would also like to know whether I am doing something wrong. I tried the problematic part of each example on its own, i.e. in a separate sentence, and the problem persists. A minimal way to reproduce the comparison programmatically is sketched below.
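For reference, here is a rough sketch of how I run the same with/without-dot comparison from code, using NLTK's wrapper around the Stanford tagger. The jar and model paths are placeholders for my local download, and the sentence is only a stand-in for the phrases in the screenshots:

```python
# Minimal sketch: tag the same token sequence with and without a trailing
# period and compare the results. Paths below are placeholders -- point
# them at your local Stanford POS tagger download.
from nltk.tag.stanford import StanfordPOSTagger

tagger = StanfordPOSTagger(
    model_filename='models/english-bidirectional-distsim.tagger',  # placeholder path
    path_to_jar='stanford-postagger.jar',                          # placeholder path
)

tokens = ['He', 'is', 'flagging', 'comments']
print(tagger.tag(tokens))          # without the dot
print(tagger.tag(tokens + ['.']))  # with the dot -- the tags may differ
```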

This is the expected behavior:

[Screenshot: expected POS tagging output from the online demo]

Any help is appreciated! Thanks

Upvotes: 1

Views: 890

Answers (2)

stealthyK

Reputation: 11

The different results from POS taggers were driving me crazy, so for sanity checks I finally wrote something to quickly compare results across the three I typically use (Stanford NLP, NLTK 3.2.1, and Senna). It also times them, since one tagger can often choke on certain text: https://github.com/StealthyK/TaggerTimer
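The core of it is just a loop like the following (a simplified sketch of the idea, not the repo's actual code; the Stanford and Senna paths are placeholders for wherever your local installs live):

```python
# Simplified sketch of the TaggerTimer idea: run the same tokens through
# several taggers and time each one. Requires the NLTK data packages
# 'punkt' and 'averaged_perceptron_tagger', plus local Stanford and Senna
# installs at the (placeholder) paths below.
import time
from nltk import pos_tag, word_tokenize
from nltk.tag.stanford import StanfordPOSTagger
from nltk.tag.senna import SennaTagger

tokens = word_tokenize("He is flagging inappropriate comments.")

taggers = {
    'NLTK': pos_tag,
    'Stanford': StanfordPOSTagger(
        'models/english-bidirectional-distsim.tagger',  # placeholder path
        'stanford-postagger.jar').tag,                   # placeholder path
    'Senna': SennaTagger('/usr/share/senna').tag,        # placeholder path
}

for name, tag in taggers.items():
    start = time.time()
    result = tag(tokens)
    print('{}: {:.2f}s {}'.format(name, time.time() - start, result))
```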

Upvotes: 1

Christopher Manning

Reputation: 9450

You're not doing anything wrong. You're of course welcome to decide for yourself how much to trust any tool, but I suspect you'll see similar issues with any parser trained empirically/statistically. As to your issues:

  • Periods are treated like any other token in model building, so, yes, they can influence the parse chosen (the sketch after this list shows one way to observe this directly).
  • There are indeed a lot of ambiguities in English (as there are in all other human languages), and the question of whether to interpret forms ending in -ing as verbs, nouns (verbal nouns or gerunds), or adjectives is a common one. The parser does not always get it right.
  • As for the particular bad choices it made, these often reflect usage/domain mismatches between the parser's training data and the sentences you are trying. The training data is predominantly news articles – last-millennium news articles, for that matter – although we do mix in some other data and occasionally add to it. So:

    • The use of flagging as a verb, common in modern internet and developer usage, doesn't occur at all in the training data, so, not surprisingly, the tagger tends to choose JJ for flagging, since that's the analysis of the only cases in the training data.
    • In news articles drinking is just more commonly a noun, with discussions of underage drinking, coffee drinking, drinking and driving, etc.
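If you want to poke at these cases yourself, a rough sketch like the following queries a locally running CoreNLP server (the server ships with 3.6.0) over its HTTP API. The port and example sentences here are only illustrative:

```python
# Rough sketch: ask a locally running CoreNLP server for per-token POS tags.
# Start the server first from a CoreNLP 3.6.0 download, e.g.:
#   java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
import json
import requests

def pos_tags(text, url='http://localhost:9000'):
    """Return (word, tag) pairs for every token CoreNLP finds in text."""
    props = {'annotators': 'tokenize,ssplit,pos', 'outputFormat': 'json'}
    resp = requests.post(url, params={'properties': json.dumps(props)},
                         data=text.encode('utf-8'))
    resp.raise_for_status()
    return [(tok['word'], tok['pos'])
            for sent in resp.json()['sentences']
            for tok in sent['tokens']]

print(pos_tags('He is flagging comments'))
print(pos_tags('He is flagging comments.'))   # the '.' comes back as its own token
print(pos_tags('Underage drinking is common.'))
```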

Upvotes: 3
