Reputation: 13426
I have a dataframe
df = pd.DataFrame({'a':[1,2,3], 'b':[5, '12$sell', '1$sell']})
I want to remove $sell from the values in column b, so I tried the str.replace() method like below:
df['b'] = df['b'].str.replace("$sell","")
but it doesn't replace the given string and gives me the same dataframe as the original.
It works when I use it with apply:
df['b'] = df['b'].apply(lambda x: str(x).replace("$sell",""))
So I want to know why it does not work in the first case.
Note: I tried replacing only $ and, surprisingly, that works.
Upvotes: 5
Views: 2700
Reputation: 863146
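$ is a regex metacharacter (it matches the end of the string), so either escape it in a raw string or add the parameter regex=False. Note that regex=True was the default before pandas 2.0 and has to be passed explicitly from 2.0 on:
df['b'] = df['b'].str.replace(r"\$sell", "", regex=True)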
print (df)
   a    b
0  1  NaN
1  2   12
2  3    1
df['b'] = df['b'].str.replace("$sell","", regex=False)
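To see why the unescaped pattern silently does nothing, here is a minimal sketch using only the standard re module. The values list is just illustrative sample data, and re.escape is shown as one way to build the escaped pattern instead of typing the backslash by hand:

```python
import re

values = ["12$sell", "1$sell"]

# Unescaped: "$" anchors at the end of the string, so the pattern
# "end of string followed by 'sell'" can never match anything.
unescaped = [re.sub("$sell", "", v) for v in values]

# Escaped: re.escape turns "$sell" into a pattern for the literal text.
pattern = re.escape("$sell")
escaped = [re.sub(pattern, "", v) for v in values]

print(unescaped)
print(escaped)
```

The same escaped pattern can be passed to str.replace together with regex=True.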
If you also want to keep the value 5, which is numeric, use Series.replace with regex=True to replace substrings; numeric values are not touched:
df['b'] = df['b'].replace(r"\$sell", "", regex=True)
print (df['b'].apply(type))
0    <class 'int'>
1    <class 'str'>
2    <class 'str'>
Name: b, dtype: object
Or cast all values in the column to strings:
df['b'] = df['b'].astype(str).str.replace("$sell","", regex=False)
print (df['b'].apply(type))
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
Name: b, dtype: object
And for better performance, if the column cannot contain missing values, use a list comprehension:
df['b'] = [str(x).replace("$sell","") for x in df['b']]
print (df)
   a   b
0  1   5
1  2  12
2  3   1
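If missing values are possible, the list comprehension can be guarded so that only strings are modified. This is a rough sketch rather than part of the answer above: strip_token is a hypothetical helper name, and the sample column is plain Python data standing in for df['b'].

```python
def strip_token(values, token="$sell"):
    """Remove `token` from string items; leave every other item unchanged."""
    # str.replace on a plain Python string is always literal (no regex),
    # so "$" needs no escaping here.
    return [v.replace(token, "") if isinstance(v, str) else v for v in values]

# Hypothetical sample: an int, two strings and a missing value (NaN).
column = [5, "12$sell", "1$sell", float("nan")]
cleaned = strip_token(column)
print(cleaned)
```

With a dataframe this would be used as df['b'] = strip_token(df['b']). Unlike the str(x) version, it keeps 5 as an integer and leaves NaN as NaN instead of turning it into the string 'nan'.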
Upvotes: 7
Reputation: 4417
str.replace assumes a regex is being used (this was the default before pandas 2.0), so you need to escape the $, i.e.
df['b'] = df['b'].str.replace(r"\$sell", "", regex=True)
Upvotes: 4
Reputation: 164753
$ is a regex special character. By default, pd.Series.str.replace uses regex=True (in pandas versions before 2.0; from 2.0 on the default is regex=False).
Instead, specify regex=False:
df['b'] = df['b'].str.replace('$sell', '', regex=False)
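As a version-independent sketch, both approaches can be written with regex passed explicitly so that the result does not depend on the pandas default. The astype(str) cast is an assumption made here so that the integer 5 is kept as '5' rather than becoming NaN:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [5, '12$sell', '1$sell']})

# Literal replacement: "$" is treated as an ordinary character.
literal = df['b'].astype(str).str.replace('$sell', '', regex=False)

# Regex replacement: "$" must be escaped, ideally in a raw string.
escaped = df['b'].astype(str).str.replace(r'\$sell', '', regex=True)

print(literal.tolist())
print(escaped.tolist())
```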
Upvotes: 4