lydias

Reputation: 841

Python syntax error in list comprehension on string for Lemmatization

I'm trying to lemmatize only the words in a string that have more than 4 letters, leaving shorter words unchanged. The desired output of the following code is 'us american', but I get an invalid syntax error on the list comprehension.

import nltk
from nltk.tokenize import TweetTokenizer
lemmatizer = nltk.stem.WordNetLemmatizer()
w_tokenizer = TweetTokenizer()    

wd = w_tokenizer.tokenize('us americans')
[lemmatizer.lemmatize(w) for w in wd if len(w)>4 else wd for wd in w]

Upvotes: 1

Views: 56

Answers (1)

lemon

Reputation: 15482

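An if placed after the for clause is a filter and cannot take an else, which is what causes the syntax error. To choose between two values for each word, put the conditional expression (x if condition else y) before the for clause: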

[lemmatizer.lemmatize(w) if len(w)>4 else w for w in wd]
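To see the difference between the two positions of if, here is a minimal sketch that runs without NLTK. The function toy_lemmatize is a hypothetical stand-in for lemmatizer.lemmatize (it only strips one trailing "s") and is not part of the original code.

```python
def toy_lemmatize(word):
    # Hypothetical stand-in for lemmatizer.lemmatize: strips one trailing "s".
    return word[:-1] if word.endswith("s") else word

words = ["us", "americans"]

# Trailing `if` is a filter: words of 4 letters or fewer are dropped entirely.
filtered = [toy_lemmatize(w) for w in words if len(w) > 4]

# Conditional expression before `for`: every word is kept,
# and only the longer ones are transformed.
mapped = [toy_lemmatize(w) if len(w) > 4 else w for w in words]

print(filtered)  # ['american']
print(mapped)    # ['us', 'american']
```

The second form is the one that matches the desired output, since 'us' has to survive unchanged.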

Then, if you want a single string as in your expected output, join the resulting words with str.join:

' '.join([lemmatizer.lemmatize(w) if len(w)>4 else w for w in wd])
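If this is needed in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: lemmatize_long_words, its parameters, and the strip_plural stand-in are hypothetical names introduced for illustration, and the lemmatizer and tokenizer are passed in as plain callables so the example runs without NLTK.

```python
def lemmatize_long_words(text, lemmatize, tokenize=str.split, min_len=5):
    """Lemmatize only tokens with at least min_len characters, then rejoin with spaces."""
    return " ".join(
        lemmatize(tok) if len(tok) >= min_len else tok
        for tok in tokenize(text)
    )

def strip_plural(word):
    # Hypothetical stand-in for a real lemmatizer: strips one trailing "s".
    return word[:-1] if word.endswith("s") else word

result = lemmatize_long_words("us americans", strip_plural)
print(result)  # us american
```

With the objects from the question, the call would presumably look like lemmatize_long_words('us americans', lemmatizer.lemmatize, w_tokenizer.tokenize), assuming the WordNet data is installed.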

Upvotes: 1
