Reputation: 3
I'm new to Python. I have an NLP project and need to remove the frequencies from my keywords. I successfully did it on one row that I turned into a list.
So the input: tokens= ['fibre', '16', ';', 'quoi', '1', ';', 'dangers', '1',]
using
tokens = [word for word in tokens if word.isalpha()]
the output is ['fibre', 'quoi', 'dangers']
Now I would like to apply this to the whole column. This is what I have:
from nltk import word_tokenize,sent_tokenize
tokens = cleaningkey.apply(word_tokenize)
tokens.head(5)
output:
0 [fibre, 16, ;, quoi, 1, ;, dangers, 1, ;, comb...
1 [restaurant, 1, ;, marrakech.shtml, 1]
2 [payer, 1, ;, faq, 1, ;, taxe, 1, ;, habitatio...
3 [rigaud, 3, ;, laurent, 3, ;, photo, 11, ;, pr...
4 [societe, 1, ;, disparition, 1, ;, proche, 1, ...
Name: text_norm, dtype: object
I tried different things but keep getting the error 'list' object has no attribute 'isalpha'. Could someone tell me how to proceed?
Thanks!
Upvotes: 0
Views: 468
Reputation: 57085
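Each element of the tokenized Series is a list, not a string, so apply the test to each item inside each list (tokens here is the tokenized Series from the question):
tokens.apply(lambda lst: [word for word in lst if word.isalpha()])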
#0 [fibre, quoi, dangers]
#1 [restaurant]
Alternatively, with filter:
tokens.apply(lambda lst: list(filter(str.isalpha, lst)))
#0 [fibre, quoi, dangers]
#1 [restaurant]
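The per-row filtering can be sketched without pandas or NLTK, using plain lists in place of the tokenized column. The names keep_alpha and rows are illustrative and not from the question, and the sample rows are copied from the output shown there. Note that str.isalpha() also drops tokens such as 'marrakech.shtml', since the dot is not a letter.

```python
def keep_alpha(token_list):
    """Return only the tokens made up entirely of letters."""
    return [tok for tok in token_list if tok.isalpha()]


# Hypothetical sample rows mimicking the tokenized column in the question.
rows = [
    ["fibre", "16", ";", "quoi", "1", ";", "dangers", "1"],
    ["restaurant", "1", ";", "marrakech.shtml", "1"],
]

# Apply the filter to every row; each row is a list of string tokens.
cleaned = [keep_alpha(row) for row in rows]
print(cleaned)
# [['fibre', 'quoi', 'dangers'], ['restaurant']]
```

On the tokenized Series from the question, the same function could presumably be passed directly, as in tokens.apply(keep_alpha), which avoids repeating the lambda.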
Upvotes: 1