Reputation: 1323
I have a column of a Pandas DataFrame called "Steuersatz".
This column is made of the following unique strings:
array(['19,00%', '0,00%', '5,00%', '4,64%', '4,04%', '4,10%', '1,63%', '3,55%',
'1,14%', '0,96%', '11,31%', '12,35%', '10,45%', '11,00%', '12,99%',
'10,83%', '6,82%', '11,50%', '16,00%', '3,30%', '4,00%', '4,16%',
'4,15%', '10,38%', '11,43%', '11,58%'], dtype=object)
I am trying to match a pattern so that any value ending in ,00 (for example 19,00%) is displayed as just the whole number followed by % (19%).
Here is what I am doing to solve this problem:
df["Steuersatz"] = df["Steuersatz"].map("{:,.2f}%".format)
df["Steuersatz"] = df["Steuersatz"].str.replace(".",",")
df['Steuersatz'] = df['Steuersatz'].str.replace("19,00%","19%")
df['Steuersatz'] = df['Steuersatz'].str.replace("0,00%","0%")
df['Steuersatz'] = df['Steuersatz'].str.replace("11,00%","11%")
df['Steuersatz'] = df['Steuersatz'].str.replace("5,00%","5%")
df['Steuersatz'] = df['Steuersatz'].str.replace("4,00%","4%")
df['Steuersatz'] = df['Steuersatz'].str.replace("16,00%","16%")
To me, this is inefficient. I would like to do this automatically rather than adding a manual replacement for each value.
Many thanks for your input.
Upvotes: 0
Views: 100
Reputation: 2720
Why not just replace ,00 with an empty string? pd.Series.str.replace can handle regex (this was the default before pandas 2.0; on newer versions pass regex=True if you need it), and even as a plain substring replacement it performs partial matching:
df['Steuersatz'] = df['Steuersatz'].str.replace(",00","")
This not only removes several repeated lines from your code but also handles new cases, such as 23,00%.
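If you want to be a bit stricter, here is a minimal sketch that anchors the pattern to the trailing %, so that ,00 is only removed when it is the whole decimal part at the end of the string. The sample values are made up for illustration, and regex=True is passed explicitly because the default differs between pandas versions:

```python
import pandas as pd

# Illustrative sample data (not the asker's full column)
df = pd.DataFrame(
    {"Steuersatz": ["19,00%", "0,00%", "4,64%", "11,00%", "23,00%", "10,45%"]}
)

# Remove ",00" only when it sits directly before the final "%".
# "$" anchors the match to the end of the string.
df["Steuersatz"] = df["Steuersatz"].str.replace(r",00%$", "%", regex=True)

print(df["Steuersatz"].tolist())
# ['19%', '0%', '4,64%', '11%', '23%', '10,45%']
```

Values such as 4,64% or 10,45% are left untouched because they do not end in ,00%.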
Upvotes: 2