John Doe

Reputation: 55

Trying to replace stopwords in pandas dataframe, sre_constants.error occurs

I'm having a problem removing stopwords from a pandas DataFrame. My code looks like this:

for word in stopwords:
    df['name'] = df['name'].str.replace(word, '')

And I get an error: sre_constants.error: nothing to repeat at position 0. Is there a solution to this error, or another way to replace stopwords?

Upvotes: 1

Views: 44

Answers (1)

Rakesh

Reputation: 82785

Try Series.replace with regex=True, joining the stopwords with | into a single pattern:

Example:

import pandas as pd

stopwords = ["AAA", "BBB"]
df = pd.DataFrame({"name": ["Hello", "World", "AAA", "BBB"]})
# "AAA|BBB" matches either stopword; each match is replaced with ""
print(df["name"].replace("|".join(stopwords), "", regex=True))

Output:

0    Hello
1    World
2         
3         
Name: name, dtype: object
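
A note on the error itself: "nothing to repeat at position 0" usually means one of the stopwords starts with a regex quantifier such as *, + or ?, which the pattern compiler cannot interpret on its own. If that is the case here (an assumption, since the question does not show the stopword list), joining the words unescaped would fail the same way. A minimal sketch that escapes each word with re.escape first; the stopword list and sample data below are made up for illustration:

```python
import re
import pandas as pd

# Hypothetical stopwords; "+" and "*" are regex metacharacters
stopwords = ["AAA", "+", "*"]
df = pd.DataFrame({"name": ["Hello AAA", "World +", "* BBB"]})

# Escape each stopword so it is matched literally, then join into one alternation
pattern = "|".join(re.escape(word) for word in stopwords)

# Remove every match, then trim the whitespace left behind
cleaned = df["name"].str.replace(pattern, "", regex=True).str.strip()
print(cleaned)
```

If you would rather keep the original loop, passing regex=False to str.replace should also avoid the error, because each word is then treated as a plain string instead of a pattern.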

Upvotes: 1
