Reputation: 7048
I have a column with currency values. All have a leading '$' and some have a trailing F. I am trying to remove these characters using replace with pandas.
My data is:
$4.00F
$21.00
$6.00
$6.00
$5.50
$13.00
$4.60F
This is my code using replace. However, it removes no characters at all.
df["Price"].replace("^/$$[F]", "", regex=True)
Where is my regex incorrect?
Upvotes: 1
Views: 90
Reputation: 75900
In this instance you can simply avoid any regex and use .str.strip('$F'):
df['Price'] = df['Price'].str.strip('$F')
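As a quick check, here is a minimal sketch that applies the same call to the sample values from the question. The final conversion to float is an extra step that is not part of the answer; it is only there to show that the cleaned strings are numeric.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"Price": ["$4.00F", "$21.00", "$6.00", "$6.00", "$5.50", "$13.00", "$4.60F"]}
)

# str.strip("$F") removes any run of "$" or "F" characters from both ends
# of each string; characters in the middle are left alone.
stripped = df["Price"].str.strip("$F")

# Assumption (not in the answer): convert the cleaned strings to floats.
as_float = stripped.astype(float)

print(stripped.tolist())
print(as_float.tolist())
```

Note that strip treats its argument as a set of characters, not as a prefix or suffix, so it would also trim a leading F or a trailing $ if one were present.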
Upvotes: 1
Reputation: 627065
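The ^/$$[F] pattern matches a / at the start of the string, then asserts the end of the string twice (the two $ anchors), and then requires an F after the end of the string. Nothing can follow the end of the string, so the pattern can never match. Note that / is not an escape character in regex; a literal $ is written as \$.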
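The behaviour of the original pattern can be checked with Python's re module. This is only a sketch; the extra sample strings "/" and "/F" are hypothetical and are included to show that even strings starting with / do not match.

```python
import re

# The pattern from the question: start of string, "/", end of string twice, then "F".
broken = re.compile(r"^/$$[F]")

# "/" and "/F" are hypothetical inputs, added only for illustration.
samples = ["$4.00F", "$21.00", "/", "/F"]

# An "F" is required after the end of the string, so no sample can match.
broken_hits = [bool(broken.search(s)) for s in samples]

# A backslash (not a forward slash) escapes "$" so that it matches a literal dollar sign.
escaped_hit = bool(re.search(r"^\$", "$4.00F"))

print(broken_hits)
print(escaped_hit)
```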
You can use
df["Price"] = df["Price"].str.replace(r"^\$|F$", "", regex=True)
This will remove a $ at the start and an F at the end.
See the regex demo.
If you do not care where the $ and F are, you may use
df["Price"] = df["Price"].str.replace(r"[$F]+", "", regex=True)
See this regex demo where [$F]+ matches one or more F or $ chars.
Also, consider the classic pattern removing any chars other than digits and dots:
df["Price"] = df["Price"].str.replace(r"[^0-9.]+", "", regex=True)
See this regex demo.
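The three patterns give the same result on the question's data, so the sketch below adds one hypothetical value, "US$1,250.00F", to show where they differ. That value is an assumption for illustration and does not come from the question.

```python
import pandas as pd

# "$4.00F" is from the question; "US$1,250.00F" is a hypothetical value
# chosen so that the three patterns produce different results.
prices = pd.Series(["$4.00F", "US$1,250.00F"], name="Price")

# 1. Anchored: only a "$" at the very start or an "F" at the very end.
anchored = prices.str.replace(r"^\$|F$", "", regex=True)

# 2. Unanchored: every "$" or "F", wherever it appears.
anywhere = prices.str.replace(r"[$F]+", "", regex=True)

# 3. Allow-list: drop everything that is not a digit or a dot.
digits_only = prices.str.replace(r"[^0-9.]+", "", regex=True)

print(anchored.tolist())
print(anywhere.tolist())
print(digits_only.tolist())
```

Each call returns a new Series rather than modifying the column in place, which is why the snippets in this answer assign the result back to df["Price"].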
Upvotes: 1
Reputation: 120469
If you want to remove any characters (not just '$' and 'F') that are not digits at the start and the end of your string:
>>> df['Price'].str.replace(r'^\D+|\D+$', '', regex=True)
0 4.00
1 21.00
2 6.00
3 6.00
4 5.50
5 13.00
6 4.60
Name: Price, dtype: object
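One detail worth checking when trimming the edges is whether the quantifier sits inside or outside the brackets. Inside a character class, + is a literal plus sign, so [^\d+] matches exactly one character that is neither a digit nor a plus, whereas \D+ matches a whole run of non-digits. A minimal sketch of the difference, using a hypothetical value "US$4.00F" that is not in the question's data:

```python
import pandas as pd

# "US$4.00F" is a hypothetical value with more than one non-digit at the start.
prices = pd.Series(["$4.00F", "$21.00", "US$4.00F"], name="Price")

# [^\d+] is a single character that is neither a digit nor a literal "+",
# so at most one character is removed at each end.
single = prices.str.replace(r"^[^\d+]|[^\d+]$", "", regex=True)

# \D+ is a run of one or more non-digits, so the whole prefix or suffix is removed.
runs = prices.str.replace(r"^\D+|\D+$", "", regex=True)

print(single.tolist())
print(runs.tolist())
```

On the question's data both forms give the same output, because each value has at most one non-digit character at either end.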
Upvotes: 0