Plasma

Reputation: 1961

"10.0" is removed entirely when replacing ".0" with ""

I have a DataFrame column that contains floating-point numbers stored as strings, and I want to remove the trailing ".0" where applicable. However, df["numbers"].str.replace(".0", "") removes the string "10.0" entirely instead of turning it into "10". This only seems to affect the numbers 10, 100, and so on.

MWE:

import pandas as pd
df = pd.DataFrame({"numbers": ["1.0", "10.0", "10.1", "100.0", "100.1", "99.0"]})
print(df)
#   numbers
# 0     1.0
# 1    10.0
# 2    10.1
# 3   100.0
# 4   100.1
# 5    99.0
print(df.numbers.str.replace(".0", ""))
# 0      1
# 1
# 2     .1
# 3      0
# 4    0.1
# 5     99

Is this a bug or is it working as intended? Also notice that "10.1" is changed to ".1" with this approach, which is weird.

Upvotes: 3

Views: 3107

Answers (2)

jezrael

Reputation: 863301

You need $ to anchor the match at the end of the string, and \ to escape the . so that it matches a literal dot:

print(df.numbers.str.replace(r"\.0$", "", regex=True))
0        1
1       10
2     10.1
3      100
4    100.1
5       99
Name: numbers, dtype: object
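
To see why the $ anchor matters, here is a minimal sketch that uses Python's standard re module instead of pandas, so that only the regex behavior is in play. The helper name strip_trailing_zero and the extra sample value "10.05" are illustrative assumptions, not part of the original data:

```python
import re


def strip_trailing_zero(text):
    # Remove a literal ".0" only when it sits at the end of the string.
    return re.sub(r"\.0$", "", text)


samples = ["1.0", "10.0", "10.1", "10.05", "100.0"]

# Anchored pattern: only a trailing ".0" is removed.
anchored = [strip_trailing_zero(s) for s in samples]

# Unanchored pattern: a ".0" in the middle of the string is removed as well.
unanchored = [re.sub(r"\.0", "", s) for s in samples]

print(anchored)    # ['1', '10', '10.1', '10.05', '100']
print(unanchored)  # ['1', '10', '10.1', '105', '100']
```

Without the anchor, "10.05" silently becomes "105", so the anchored form is the safer choice whenever the column may hold values with more than one decimal place.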

Upvotes: 6

Katriel

Reputation: 123762

Series.str.replace treats the pattern as a regular expression (by default in pandas before 2.0, and with regex=True in later versions), so the . matches any character. You want

df.numbers.str.replace(r"\.0", "", regex=True)
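
The wildcard behavior can be reproduced outside pandas. The following is a small sketch with the standard re module, assuming the single value "10.0" from the question; the variable names are illustrative only:

```python
import re

value = "10.0"

# Unescaped: "." matches any character, so both "10" and ".0" match
# and the whole string is removed.
wildcard = re.sub(".0", "", value)

# Escaped: only a literal dot followed by "0" matches.
escaped = re.sub(r"\.0", "", value)

# re.escape builds the escaped pattern from a literal string.
pattern = re.escape(".0")
via_escape = re.sub(pattern, "", value)

# Plain str.replace never interprets its argument as a regex.
literal = value.replace(".0", "")

print(repr(wildcard))    # ''
print(repr(escaped))     # '10'
print(repr(via_escape))  # '10'
print(repr(literal))     # '10'
```

The first call shows the same effect as in the question: the pattern matches "10" first and then ".0", leaving an empty string.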

Upvotes: 12
