Reputation: 10257
While experimenting with NLTK part-of-speech tagging, I noticed a lot of VBP tags in the output of my calls to nltk.pos_tag. This tag is not in the Brown Corpus part-of-speech tagset. It is, however, part of the UPenn tagset.
What tagset does nltk use by default? I can't find this in the official documentation or the apidocs.
Upvotes: 8
Views: 4349
Reputation: 615
NLTK uses the Penn Treebank tagset by default. Others are available: the NLTK library also includes other taggers that use other tagsets.
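As a rough sketch of the difference, assuming NLTK and its tagger data are installed (the data package names passed to nltk.download vary between NLTK versions, so they are not shown here): nltk.pos_tag accepts an optional tagset argument, and leaving it out gives Penn Treebank tags. The helper name tag_with_both_tagsets is made up for illustration.

```python
def tag_with_both_tagsets(tokens):
    """Return (penn_tags, universal_tags) for a list of tokens.

    Assumes NLTK and its tagger data are installed. The import is kept
    inside the function so a missing dependency only fails when called.
    """
    import nltk  # third-party package, installed separately

    # Default call: tags come from the Penn Treebank tagset (e.g. VBP).
    penn = nltk.pos_tag(tokens)
    # Optional tagset argument: map the tags to the coarser universal tagset.
    universal = nltk.pos_tag(tokens, tagset="universal")
    return penn, universal
```

Calling it with a token list such as ["I", "like", "cats"] should return two lists of (token, tag) pairs, one per tagset.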
Upvotes: 0
Reputation: 136
It uses the POS tags from the Penn Treebank Project. You can see the list of tags with their meanings at http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
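To make the VBP case from the question concrete, here is a small sketch using a hand-copied subset of that list (only the six verb tags). The helper names and the sample tagged sentence are invented for illustration; this is not NLTK code and not actual tagger output.

```python
# Hand-copied subset of the Penn Treebank tagset: the six verb tags.
PENN_VERB_TAGS = {
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund or present participle",
    "VBN": "verb, past participle",
    "VBP": "verb, non-3rd person singular present",
    "VBZ": "verb, 3rd person singular present",
}


def describe_tag(tag):
    """Return the description of a Penn Treebank verb tag, or None if unknown."""
    return PENN_VERB_TAGS.get(tag)


def verb_tags_in(tagged):
    """Collect the (token, tag) pairs whose tag is one of the verb tags above."""
    return [(tok, tag) for tok, tag in tagged if tag in PENN_VERB_TAGS]


# Hypothetical tagger output, written by hand for illustration.
sample = [("I", "PRP"), ("like", "VBP"), ("cats", "NNS")]
print(describe_tag("VBP"))
print(verb_tags_in(sample))
```

So VBP marks a present-tense verb that is not third person singular, which is why it shows up so often in tagged English text.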
Upvotes: 5
Reputation: 1480
NLTK uses the Penn Treebank tagset. Have a look at this link: http://nltk.org/api/nltk.tag.html
Upvotes: 8