Reputation: 11420
I am trying to get my hands dirty with nltk, following http://victoria.lviv.ua/../NaturalLanguageProcessingWithPython.pdf. It states that the nltk.pos_tag function assigns a part of speech to each word in the list of words passed to it as an argument.
Moving ahead, I found that there are also nltk.DefaultTagger, nltk.RegexpTagger, nltk.UnigramTagger and nltk.BigramTagger.
I am confused about why we need these taggers, since nltk.pos_tag already does a good job of tagging parts of speech. Moreover, which tagger does nltk.pos_tag use internally?
Thanks in advance.
Upvotes: 1
Views: 1450
Reputation: 122148
By default, nltk.pos_tag uses a pre-trained PerceptronTagger model. The data and walk-through documentation can be found on:
The UnigramTagger and BigramTagger are classes that contain no pre-trained model; you have to train them on tagged sentences yourself before they can tag anything.
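To make the difference concrete, here is a minimal sketch of training these taggers by hand. The three-sentence training set and its tags are made up for illustration (a real setup would train on a tagged corpus), and the names train_sents, plain_unigram and bigram are just placeholders. It needs only the nltk package itself, no downloaded data.

```python
from nltk.tag import BigramTagger, DefaultTagger, UnigramTagger

# Tiny hand-written training set (hypothetical, for illustration only).
# Each sentence is a list of (word, tag) pairs.
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]

tokens = ["the", "dog", "runs"]

# A unigram tagger with no backoff: words it never saw get the tag None.
plain_unigram = UnigramTagger(train_sents)
print(plain_unigram.tag(tokens))

# Chain the taggers: bigram -> unigram -> default.
# Each tagger hands the word to its backoff when it cannot decide.
default = DefaultTagger("NN")  # tags every word as NN
unigram = UnigramTagger(train_sents, backoff=default)
bigram = BigramTagger(train_sents, backoff=unigram)
print(bigram.tag(tokens))
```

With this toy data the first call leaves "runs" untagged (None), while the chained tagger falls back to NN for it. That is roughly why these classes exist alongside nltk.pos_tag: they let you build and train your own tagger for your own tagset or domain.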
Chapter 5 of the NLTK book (http://www.nltk.org/book/ch05.html) provides an introduction to the available POS taggers:
DefaultTagger: Chapter 5, Section 4.1
RegexpTagger: Chapter 5, Section 4.2
NgramTagger: Chapter 5, Section 5.3
Upvotes: 2