Dakshila Kamalsooriya

Reputation: 1401

How to get a nested list by stemming the words inside the nested lists?

I have a Python list, tokens, that contains several sublists of word tokens. I want to stem every token while keeping the sublist structure, so that the output matches stemmed_expected.

tokens = [['cooked', 'lovely','baked'],['hotel', 'going','liked'],['room','looking']]

stemmed_expected: [['cook', 'love','bake'],['hotel', 'go','like'],['room','look']]

The for loop I tried is as follows:

from nltk.stem import PorterStemmer  
ps = PorterStemmer()

stemmed_actual = []

for m in tokens:
    for word in m:
        word = ps.stem(word)
        stemmed_actual.append(word)

But the output of this for loop is:

stemmed_actual = ['cook', 'love', 'bake', 'hotel', 'go', 'like', 'room', 'look']

How can I modify the for loop so that the stemmed words stay grouped in sublists, as in stemmed_expected?

Upvotes: 2

Views: 146

Answers (1)

j1-lee

Reputation: 13939

You can use a nested list comprehension, which builds one inner list per sublist instead of appending every word to a single flat list:

from nltk.stem import PorterStemmer

tokens = [['cooked', 'lovely','baked'],['hotel', 'going','liked'],['room','looking']]

ps = PorterStemmer()
stemmed = [[ps.stem(word) for word in sublst] for sublst in tokens]

print(stemmed)
# [['cook', 'love', 'bake'], ['hotel', 'go', 'like'], ['room', 'look']]
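
If you would rather keep an explicit for loop, as in the question, the same result can be had by creating a fresh inner list for each sublist and appending that inner list to the outer one. The following is only a minimal sketch: stem_sublists is a hypothetical helper name, and toy_stem is a stand-in (not a real stemmer) used so the example runs without NLTK installed.

```python
def stem_sublists(tokens, stem):
    """Apply `stem` to every word while keeping the sublist structure."""
    stemmed = []
    for sublist in tokens:
        stemmed_sublist = []  # start a fresh inner list for each sublist
        for word in sublist:
            stemmed_sublist.append(stem(word))
        stemmed.append(stemmed_sublist)  # append the whole inner list, not each word
    return stemmed


def toy_stem(word):
    # Hypothetical stand-in for ps.stem: strips a couple of suffixes only.
    for suffix in ("ing", "ed"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


tokens = [['cooked', 'looked'], ['going', 'room']]
print(stem_sublists(tokens, toy_stem))
# [['cook', 'look'], ['go', 'room']]
```

With NLTK available, passing ps.stem in place of toy_stem should give the nested output shown above. The original loop flattened the result because it appended each word directly to the outer list.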

Upvotes: 2
