Yanick Nedderhoff

Reputation: 1234

Stanford parser output doesn't match demo output

If I use the Stanford CoreNLP neural network dependency parser with the english_SD model, which performs pretty well according to the website (link, bottom of the page), it produces completely different results from this demo, which I assume is based on the LexicalizedParser (or at least on a different parser).

If I put the sentence "I don't like the car" into the demo page, this is the result:

[Image: dependency parse produced by the demo page]

If I put the same sentence into the neural network parser, this is the result:

[Image: dependency parse produced by the neural network parser]

In the neural network parser's output, everything depends directly on like. I suspect this could be due to different POS tags, but I used the CoreNLP MaxentTagger with the english-bidirectional-distsim.tagger model, which should be a fairly standard choice. Any ideas on this?
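
For reference, my setup looks roughly like the following sketch; the model paths are the ones from my distribution and may differ between CoreNLP releases:

    import java.util.Properties;

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.semgraph.SemanticGraph;
    import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
    import edu.stanford.nlp.util.CoreMap;

    public class DepParseExample {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, depparse");
        // use the bidirectional tagger instead of the pipeline default
        props.setProperty("pos.model",
            "edu/stanford/nlp/models/pos-tagger/english-bidirectional-distsim/english-bidirectional-distsim.tagger");
        // use the english_SD neural network dependency parser model
        props.setProperty("depparse.model",
            "edu/stanford/nlp/models/parser/nndep/english_SD.gz");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation doc = new Annotation("I don't like the car");
        pipeline.annotate(doc);
        for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
          SemanticGraph deps =
              sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
          // print one line per dependency, e.g. nsubj(like-4, I-1)
          System.out.println(deps.toList());
        }
      }
    }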

Upvotes: 0

Views: 299

Answers (1)

Sebastian Schuster

Reputation: 1563

By default, we use the english-left3words-distsim.tagger model for the tagger, which is faster than the bidirectional model but occasionally produces worse results. As both the constituency parser used on the demo page and the neural network dependency parser you used rely heavily on POS tags, it is not really surprising that different POS sequences lead to different parses, especially when the main verb gets a function word tag (IN = preposition) instead of a content word tag (VB = verb, base form).
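
If you want to check whether the two taggers actually disagree on your sentence, you can load both models directly with the MaxentTagger and compare their output. A quick sketch (the exact model paths depend on your CoreNLP release):

    import edu.stanford.nlp.tagger.maxent.MaxentTagger;

    public class TaggerComparison {
      public static void main(String[] args) {
        // the pipeline default (fast, occasionally less accurate)
        MaxentTagger left3words = new MaxentTagger(
            "edu/stanford/nlp/models/pos-tagger/english-left3words-distsim/english-left3words-distsim.tagger");
        // the slower bidirectional model
        MaxentTagger bidirectional = new MaxentTagger(
            "edu/stanford/nlp/models/pos-tagger/english-bidirectional-distsim/english-bidirectional-distsim.tagger");

        String sentence = "I don't like the car";
        // tagString returns the sentence with a _TAG suffix on each token
        System.out.println(left3words.tagString(sentence));
        System.out.println(bidirectional.tagString(sentence));
      }
    }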

But also note that the demo outputs dependency parses in the new Universal Dependencies representation, while the english_SD model parses sentences to the old Stanford Dependencies representation. For your example sentence the correct parses happen to be the same, but you will see differences for other sentences, especially ones with prepositional phrases, which are treated differently in the new representation.
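
If you want the neural network parser to produce the same Universal Dependencies output as the demo, you can point it at the UD model instead of english_SD, assuming your CoreNLP version already ships one. A minimal sketch:

    import java.util.Properties;

    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    public class UDModelExample {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, depparse");
        // select the Universal Dependencies model instead of english_SD
        props.setProperty("depparse.model",
            "edu/stanford/nlp/models/parser/nndep/english_UD.gz");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // ... annotate sentences as usual; the dependency labels will now be UD
      }
    }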

Upvotes: 2
