aldimeola1122

Reputation: 818

Stemming Algorithm

I have a question about the Porter Stemmer algorithm. I researched it on the internet,

but I couldn't find out what the difference between understemming and overstemming is.

Also, does the Porter algorithm understem or overstem?

Do you have any idea?

Thanks in advance

Upvotes: 0

Views: 524

Answers (1)

ales_t

Reputation: 2017

Overstemming happens when the cut-off suffix is too long; this leads to spurious matching of unrelated words.

Understemming is the opposite: too little is cut off, so related words fail to match. For example, a stemmer that doesn't cut off anything inherently understems.
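To make the two error types concrete, here is a rough sketch in Python. Note that this is a toy suffix stripper, not the Porter algorithm; the function `strip_suffix` and the suffix lists `AGGRESSIVE` and `CONSERVATIVE` are made up purely for illustration.

```python
# Toy suffix stripper for illustration only. This is NOT the Porter algorithm;
# the names and suffix lists below are hypothetical.

def strip_suffix(word, suffixes):
    """Remove the first matching suffix, keeping at least 3 characters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Cuts off a lot: prone to overstemming.
AGGRESSIVE = ["ity", "ing", "ed", "e", "s"]

# Cuts off almost nothing: prone to understemming.
CONSERVATIVE = ["s"]

# Overstemming: two words with different meanings collapse to the same stem.
over_a = strip_suffix("universe", AGGRESSIVE)
over_b = strip_suffix("university", AGGRESSIVE)
print(over_a, over_b, over_a == over_b)

# Understemming: two forms of the same word keep different stems.
under_a = strip_suffix("connects", CONSERVATIVE)
under_b = strip_suffix("connected", CONSERVATIVE)
print(under_a, under_b, under_a == under_b)
```

With the aggressive list, "universe" and "university" end up with the same stem, so a search for one would match the other. With the conservative list, "connects" and "connected" stay apart, so a search for one would miss the other.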

Porter Stemmer, I suspect, will make both types of errors from time to time for English. Note that implementations for other languages might behave very differently (I'm thinking of Snowball, which has user-supplied algorithms for a bunch of languages). They may even differ in the linguistic definition of a stem.

Upvotes: 1
