Ghadah

Reputation: 3

Loop over tweet text and, if a word is not a stop word, correct its spelling, lemmatize, and stem it

The code below should loop through the text column of a tweet dataset and, for each word that is not in the stop word list, correct its spelling, lemmatize it, and then stem it. It is not working properly. Can you help me fix it? Please see the error in the attached image.

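import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from spellchecker import SpellChecker  # provided by the pyspellchecker package

pstem = PorterStemmer()
lem = WordNetLemmatizer()
spell = SpellChecker()
stop_words = set(stopwords.words('english'))  # a set makes membership tests fast

# Iterate over index labels so this also works when the index is not 0..n-1
for i in df.index:
    text = str(df.at[i, 'text'])
    tokens = nltk.word_tokenize(text)
    # The NLTK stop word list is lowercase, so compare in lowercase
    tokens = [word for word in tokens if word.lower() not in stop_words]
    for j in range(len(tokens)):
        # spell.correction() can return None when it finds no candidate;
        # fall back to the original token in that case
        tokens[j] = spell.correction(tokens[j]) or tokens[j]
        tokens[j] = lem.lemmatize(tokens[j])
        tokens[j] = pstem.stem(tokens[j])
    tokens_sent = ' '.join(tokens)
    df.at[i, "text"] = tokens_sent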
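
To make the per-word step easier to check on its own, here is a minimal sketch of it as a standalone function. It is only an illustration under some assumptions: `clean_tokens` is a hypothetical helper name, and `toy_correct`, `toy_lemmatize`, and `toy_stem` are made-up stand-ins for the spell checker, lemmatizer, and stemmer so that the example runs without NLTK or pyspellchecker installed. The one thing it is meant to show is the fallback to the original word when the corrector returns `None`, which may or may not be the cause of the error in the image.

```python
from typing import Callable, Iterable, List, Optional, Set


def clean_tokens(
    tokens: Iterable[str],
    stop_words: Set[str],
    correct: Callable[[str], Optional[str]],
    lemmatize: Callable[[str], str],
    stem: Callable[[str], str],
) -> List[str]:
    """Drop stop words, then spell-correct, lemmatize, and stem the rest."""
    cleaned = []
    for word in tokens:
        # Stop word lists are usually lowercase, so compare in lowercase
        if word.lower() in stop_words:
            continue
        # Keep the original word if the corrector has no suggestion (None)
        fixed = correct(word) or word
        cleaned.append(stem(lemmatize(fixed)))
    return cleaned


# Hypothetical stand-ins, NOT the real library behavior
def toy_correct(word: str) -> Optional[str]:
    # Returns None for words it does not know
    return {"runing": "running"}.get(word)


def toy_lemmatize(word: str) -> str:
    return word


def toy_stem(word: str) -> str:
    return word[:-3] if word.endswith("ing") else word


result = clean_tokens(
    ["I", "am", "runing", "xqzt"],
    {"i", "am"},
    toy_correct,
    toy_lemmatize,
    toy_stem,
)
print(result)
```

With the real objects from the code above, the same function would be called as `clean_tokens(nltk.word_tokenize(text), stop_words, spell.correction, lem.lemmatize, pstem.stem)`.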

Upvotes: 0

Views: 73

Answers (0)
