Reputation: 3
The code below should loop through the text column of a tweet dataset and, for each word that is not in the stop-word list, correct its spelling, lemmatize it, and then stem it. It is not working properly. Can you help me fix it? Please see the error in the attached image.
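import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from spellchecker import SpellChecker

pstem = PorterStemmer()
lem = WordNetLemmatizer()
spell = SpellChecker()
stop_words = stopwords.words('english')

# df is a pandas DataFrame with a 'text' column; the loop assumes a default 0..n-1 index
for i in range(len(df.index)):
    text = df.loc[i]['text']
    tokens = nltk.word_tokenize(text)
    # drop stop words before the per-word processing
    tokens = [word for word in tokens if word not in stop_words]
    for j in range(len(tokens)):
        tokens[j] = spell.correction(tokens[j])
        tokens[j] = lem.lemmatize(tokens[j])
        tokens[j] = pstem.stem(tokens[j])
    tokens_sent = ' '.join(tokens)
    df.at[i, "text"] = tokens_sent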
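The screenshot with the error is not reproduced here, so the following is only a guess at one likely cause. Depending on the installed pyspellchecker version, `spell.correction` can return `None` when it finds no candidate for a word (common with hashtags, handles, and slang in tweets), and passing `None` on to the lemmatizer or stemmer then raises an exception. A minimal sketch of a guard against that is shown below. The function name `preprocess_tokens` and its parameters are hypothetical and not part of the original post; the spell checker, lemmatizer, and stemmer are passed in as plain callables so the logic can be tried without any NLTK data installed.

```python
from typing import Callable, Iterable, Optional


def preprocess_tokens(
    tokens: Iterable[str],
    stop_words: Iterable[str],
    correct: Callable[[str], Optional[str]],
    lemmatize: Callable[[str], str],
    stem: Callable[[str], str],
) -> str:
    """Drop stop words, then spell-correct, lemmatize, and stem each token.

    Hypothetical helper for illustration: `correct` may return None,
    in which case the original token is kept unchanged.
    """
    stop_set = set(stop_words)  # set membership is faster than scanning a list
    cleaned = []
    for word in tokens:
        if word in stop_set:
            continue
        fixed = correct(word)
        if fixed is None:
            # no correction found: fall back to the original token
            fixed = word
        cleaned.append(stem(lemmatize(fixed)))
    return " ".join(cleaned)
```

With the objects from the question, this could be wired up as `df["text"] = df["text"].apply(lambda t: preprocess_tokens(nltk.word_tokenize(t), stop_words, spell.correction, lem.lemmatize, pstem.stem))`. Using `apply` on the column also avoids relying on the DataFrame having a default 0..n-1 index, which `df.loc[i]` in the original loop assumes.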
Upvotes: 0
Views: 73