Blue482

Reputation: 3156

Can I choose a pos.model in the Stanford Parser?

I want to use gate-EN-twitter.model for POS tagging while parsing with the Stanford Parser. Is there a command-line option for that, such as -pos.model gate-EN-twitter.model? Or do I have to run the Stanford POS Tagger with the GATE model first and then use its output as input to the parser?

Thanks!

Upvotes: 1

Views: 471

Answers (1)

Jon Gauthier

Reputation: 25592

If I understand you correctly, you want to force the Stanford Parser to use the tags generated by this Twitter-specific POS tagger. That's definitely possible, though this tweet from Stanford NLP about this exact model should serve as a warning (note the "iff not parsing"):

Tweet from Stanford NLP, 13 Apr 2014:

Using CoreNLP on social media? Try GATE Twitter model (iff not parsing…) -pos.model gate-EN-twitter.model https://gate.ac.uk/wiki/twitter-postagger.html #nlproc

(https://twitter.com/stanfordnlp/status/455409761492549632)

That being said, if you really want to try, we can't stop you :)

There is a parser FAQ entry on forcing in your own tags. See http://nlp.stanford.edu/software/parser-faq.shtml#f

Basically, you have two options (see the FAQ for full details):

  • If calling the parser from the command line, you can pre-tag your text file and then alert the parser to the fact that the text is pre-tagged using some command-line options.
  • If parsing programmatically, the LexicalizedParser#parse method will accept any List<? extends HasTag> and treat the tags in that list as gold. Just pre-tag your list (using the CoreNLP pipeline or MaxentTagger) and pass that token list to the parser.
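
For the command-line route, the pre-tagged file is plain text with one sentence per line and each token joined to its tag by a separator character. Here is a minimal sketch of producing such a line from tagger output, using only the Java standard library. The class name PreTaggedLine, the sample tokens and tags, and the "/" separator are my own choices for illustration, not anything from the parser itself; the FAQ linked above has the exact options that tell the parser the input is already tokenized and tagged, and which separator it should expect.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper (not part of Stanford Parser): turns parallel lists of
// words and POS tags into one "word<sep>TAG" line per sentence.
class PreTaggedLine {

    // Joins each word to its tag with the separator, tokens split by spaces.
    public static String format(List<String> words, List<String> tags, String separator) {
        if (words.size() != tags.size()) {
            throw new IllegalArgumentException("words and tags must have the same length");
        }
        StringJoiner line = new StringJoiner(" ");
        for (int i = 0; i < words.size(); i++) {
            line.add(words.get(i) + separator + tags.get(i));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Made-up sentence and tags, standing in for real tagger output.
        List<String> words = List.of("omg", "this", "parser", "rocks");
        List<String> tags = List.of("UH", "DT", "NN", "VBZ");
        System.out.println(format(words, tags, "/")); // prints "omg/UH this/DT parser/NN rocks/VBZ"
    }
}
```

One thing to watch with tweets: tokens such as URLs contain "/" themselves, so you may want to pick a separator that cannot appear in your tokens and configure the parser to match.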

Upvotes: 1
