Reputation: 55
I'm having a problem removing stopwords from a pandas DataFrame. My code goes like this:
for word in stopwords:
    df['name'] = df['name'].str.replace(word, '')
And I get an error: sre_constants.error: nothing to repeat at position 0. Is there any solution to this error, or any other way to replace stopwords?
Upvotes: 1
Views: 44
Reputation: 82785
Try df.replace with regex=True. Example:
import pandas as pd
stopwords = ["AAA", "BBB"]
df = pd.DataFrame({"name": ["Hello", "World", "AAA", "BBB"]})
print(df["name"].replace("|".join(stopwords), "", regex=True))
Output:
0    Hello
1    World
2
3
Name: name, dtype: object
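Note that "nothing to repeat at position 0" is what Python's re module reports when a pattern starts with a quantifier such as *, + or ?. So the error most likely means one of the stopwords contains such a character, and joining the words with | would not avoid it on its own. A minimal sketch, assuming that is the cause (the stopwords and names below are made up for illustration), escapes each word with re.escape before building the pattern:

```python
import re

import pandas as pd

# Hypothetical stopword list; "+" and "*" are regex metacharacters.
stopwords = ["AAA", "C++", "*"]
df = pd.DataFrame({"name": ["Hello AAA", "C++ World", "a * b"]})

# Escape each word so it is matched literally, then join into one alternation.
pattern = "|".join(re.escape(word) for word in stopwords)

# A single vectorized pass replaces every stopword with an empty string.
df["name"] = df["name"].str.replace(pattern, "", regex=True)
print(df["name"].tolist())
```

This leaves the surrounding whitespace in place, so a follow-up .str.strip() may be wanted. Alternatively, passing regex=False to str.replace in the original loop should treat each word as a literal string.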
Upvotes: 1