Karanam Krishna

Reputation: 365

What is the real purpose of Stemming in NLP?

I understand stemming and lemmatization as follows:
stemming - reduces words to their non-changing portion; amusing, amusement -> amus
lemmatization - reduces words to their dictionary form; amusing, amusement -> amuse
I can understand why lemmatization is used, but I don't get the purpose of stemming. Can you explain?

Upvotes: 2

Views: 212

Answers (1)

Sociopath

Reputation: 13401

As you said, stemming reduces words to their non-changing portions, and lemmatization reduces them to their dictionary form.

Text representations used in machine learning, such as bag-of-words (BoW) or tf-idf, are based on word frequency.

Let's take the example you provided in your question.

With stemming, amusing and amusement both return amus, so the two words are treated as the same word and the frequency of amus is 2.

With lemmatization, taking the mapping from your question, amusing and amusement both return amuse, so again the two words are treated as the same word and the frequency of amuse is 2.

In this particular case, it doesn't matter to your model whether you use stemming or lemmatization.
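The counting argument can be sketched in plain Python. This is only an illustration: `toy_stem` and its suffix list are hypothetical stand-ins for a real stemmer, chosen so that the two words from the question collapse to amus.

```python
from collections import Counter

# Hypothetical toy stemmer (not a real library API): strips the first
# matching suffix, as long as at least three characters remain.
SUFFIXES = ("ement", "ing")

def toy_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

tokens = ["amusing", "amusement"]

# Without normalization, each surface form is counted separately.
raw_counts = Counter(tokens)

# After stemming, both forms fall into the same bucket.
stemmed_counts = Counter(toy_stem(t) for t in tokens)

print(raw_counts)      # two entries, each with count 1
print(stemmed_counts)  # one entry, "amus", with count 2
```

A frequency-based representation built on the stemmed tokens sees one feature with count 2 instead of two features with count 1 each.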

Stemming just strips letters from the word, while lemmatization requires a dictionary lookup to find the related word, so stemming is generally faster than lemmatization.

So you can choose stemming over lemmatization if you want to speed up preprocessing.

Disadvantage

In the case of stemming, a simple suffix-stripping stemmer can give study for studying but studi for studies (the exact output depends on the stemmer).

Even though these words share the same root, they will be treated as different words.
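A rough sketch of this failure mode, assuming a naive suffix stripper and a tiny hand-written lookup table. Both `naive_stem` and `LEMMA_TABLE` are hypothetical and only stand in for a real stemmer and a real lemmatizer's dictionary.

```python
# Hypothetical toy stemmer: strips the first matching suffix, no dictionary involved.
SUFFIXES = ("ing", "es", "s")

def naive_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical stand-in for the dictionary a lemmatizer would consult.
LEMMA_TABLE = {"studying": "study", "studies": "study"}

def toy_lemmatize(word):
    # Fall back to the word itself when it is not in the table.
    return LEMMA_TABLE.get(word, word)

words = ["studying", "studies"]
stems = [naive_stem(w) for w in words]
lemmas = [toy_lemmatize(w) for w in words]

print(stems)   # the two stems differ: "study" and "studi"
print(lemmas)  # both lemmas are "study"
```

The stripper never checks its result against a vocabulary, so it is cheap but can split one root into several tokens, while the lookup merges them at the cost of needing the dictionary.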

Upvotes: 2
