Reputation: 818
I have a question about the Porter stemming algorithm. I searched online,
but I couldn't find a clear explanation of the difference between understemming and overstemming.
Is the Porter algorithm prone to understemming or to overstemming?
Any ideas?
Thanks in advance.
Upvotes: 0
Views: 524
Reputation: 2017
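Overstemming happens when the cut-off suffix is too long: too much of the word is removed, so unrelated words collapse to the same stem and match spuriously.
Understemming is the opposite: too little is removed, so related forms of the same word keep different stems and fail to match. For example, a stemmer that doesn't cut off anything inherently understems.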
The Porter stemmer, I suspect, makes both types of errors from time to time on English. Note that implementations for other languages might behave very differently (I'm thinking of Snowball, which has user-supplied algorithms for a bunch of languages). They may even differ in their linguistic definition of a stem.
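
To make the two error types concrete, here is a minimal sketch in Python. It is not the Porter algorithm: `strip_suffix`, the two suffix lists, and the word groups are all made up for illustration. One toy stemmer strips many suffixes, the other strips almost nothing.

```python
# Toy suffix strippers to illustrate over- and understemming.
# NOTE: this is NOT the Porter algorithm; the suffix lists are hypothetical.

def strip_suffix(word, suffixes):
    """Remove the longest matching suffix, keeping at least 3 letters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

AGGRESSIVE = ["ity", "ing", "al", "es", "ed", "e", "s"]  # strips a lot
CONSERVATIVE = ["s"]                                     # strips almost nothing

def aggressive_stem(word):
    return strip_suffix(word, AGGRESSIVE)

def conservative_stem(word):
    return strip_suffix(word, CONSERVATIVE)

# Words with different meanings that an aggressive stemmer merges.
unrelated = ["universe", "university", "universal"]
# Forms of one word that a conservative stemmer fails to merge.
related = ["connect", "connected", "connecting"]

# Overstemming: all three unrelated words get the same stem.
overstemmed = {aggressive_stem(w) for w in unrelated}

# Understemming: the three related forms keep three different stems.
understemmed = {conservative_stem(w) for w in related}

print("aggressive on unrelated words:", overstemmed)
print("conservative on related words:", understemmed)
print("aggressive on related words:", {aggressive_stem(w) for w in related})
```

The aggressive stemmer handles the `connect` group correctly but merges the `univers*` words, while the conservative one keeps those apart but also fails to merge the `connect` forms. Any real stemmer sits somewhere between these two extremes, which is why it can make both kinds of error.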
Upvotes: 1