Reputation: 2545
I used the following code to train a ClassifierBasedPOSTagger for POS tagging:
from nltk.classify import MaxentClassifier
from nltk.tag.sequential import ClassifierBasedPOSTagger
# train_sents: the tagged training sentences (their source is not shown here)
me_tagger = ClassifierBasedPOSTagger(
    train=train_sents,
    classifier_builder=lambda train_feats: MaxentClassifier.train(train_feats, max_iter=15),
)
print(me_tagger.tag('My new watch is awesome...'.split()))
Which prints out the following tags:
[('My', 'PP$'), ('new', 'JJ'), ('watch', 'NN'), ('is', 'BEZ'), ('awesome...', 'AT')]
Where can I find the tag definitions for this classifier? I am familiar with most of these tags, but I cannot work out what BEZ and AT mean.
Upvotes: 0
Views: 199
Reputation: 50220
The tagset has nothing to do with the classifier class you chose; it comes from your training data. So the real question is "where do I find the tag definitions for this POS-tagged corpus?" You don't say where your train_sents came from, but (as @RAVI already pointed out) these tags appear to come from the Brown corpus. You can read its tagset documentation online, or look it up from within NLTK like this:
>>> import nltk
>>> nltk.help.brown_tagset("BEZ")
BEZ: verb 'to be', present tense, 3rd person singular
    is
>>> nltk.help.brown_tagset()  # all tags
...
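To make the point that the tagset lives in the training data concrete, here is a minimal sketch that lists the tags actually present in a set of tagged sentences. The `train_sents` sample, the `KNOWN_DEFINITIONS` table, and the helper names `tag_counts` and `describe` are all hypothetical and exist only for illustration; the two definitions are the ones quoted in this thread, not the full Brown tagset.

```python
from collections import Counter

# Hypothetical training data in the shape NLTK taggers expect:
# a list of sentences, each a list of (word, tag) pairs.
train_sents = [
    [("The", "AT"), ("watch", "NN"), ("is", "BEZ"), ("new", "JJ")],
    [("A", "AT"), ("clock", "NN"), ("is", "BEZ"), ("old", "JJ")],
]

# Assumption: only the two definitions quoted in this thread are included.
KNOWN_DEFINITIONS = {
    "AT": "article",
    "BEZ": "verb 'to be', present tense, 3rd person singular",
}


def tag_counts(sents):
    """Count how often each tag occurs across the tagged sentences."""
    return Counter(tag for sent in sents for _word, tag in sent)


def describe(tag):
    """Return a known definition, or a pointer to the corpus documentation."""
    return KNOWN_DEFINITIONS.get(tag, "(see the corpus tagset documentation)")


counts = tag_counts(train_sents)
for tag in sorted(counts):
    print(f"{tag:4} {counts[tag]:2}  {describe(tag)}")
```

Running the same count over your real `train_sents` shows exactly which tags the tagger can ever emit, whichever classifier you plug in.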
Upvotes: 1
Reputation: 3153
You can check the Brown Corpus tag set. An excerpt:
╔═════╦═════════════════════╦════════════════════╗
║ Tag ║ Description         ║ Examples           ║
╠═════╬═════════════════════╬════════════════════╣
║ AT  ║ article             ║ the an no a every  ║
║     ║                     ║ th' ever' ye       ║
╠═════╬═════════════════════╬════════════════════╣
║ BEZ ║ verb "to be",       ║ is                 ║
║     ║ present tense,      ║                    ║
║     ║ 3rd person singular ║                    ║
╠═════╬═════════════════════╬════════════════════╣
║ ... ║ ...                 ║ ...                ║
╚═════╩═════════════════════╩════════════════════╝
Upvotes: 2