Reputation: 1117
I have a pandas dataframe with below columns. Column_1 is string/text and not integer or decimal. Couple of rows have string value as well like Names (refer row no 6)
S.No. Column_1
1 256
2 1
3 $300.54672
4 756
5 $292.34333
6 Andrew
I want to convert all values in column_1 to numbers/int, except the dollar values and rows with Names. I require the dollar symbol to be retained but the amount should be rounded to 2 digits after decimal point.
Expected output:
S.No. Column_1
1 256
2 1
3 $300.55
4 756
5 $292.34
6 Andrew
I converted entire column to numeric using pd.to_numeric() with errors='coerce' but the amount values became blank (or) null as it was an error.
Any suggestions/help on this would be highly appreciated. Thank you.
Upvotes: 2
Views: 2457
Reputation: 863056
Filter values starting by $
by Series.str.startswith
, remove $
by Series.str.strip
, convert to numeric, round, convert to strings and prepend $
:
m = df['Column_1'].str.startswith('$', na=False)
s = '$' + df.loc[m, 'Column_1'].str.strip('$').astype(float).round(2).astype(str)
Or:
s = df.loc[m, 'Column_1'].str.strip('$').astype(float).round(2).astype(str).radd('$')
df.loc[m, 'Column_1'] = s
print (df)
S.No. Column_1
0 1 256
1 2 1
2 3 $300.55
3 4 756
4 5 $292.34
Last if need non matched values convert to numeric, but get mixed data types - strings with $
and numbers without $
:
df.loc[~m, 'Column_1'] = pd.to_numeric(df.loc[~m, 'Column_1'])
print (df)
S.No. Column_1
0 1 256
1 2 1
2 3 $300.54672
3 4 756
4 5 $292.34333
print (df['Column_1'].apply(type))
0 <class 'int'>
1 <class 'int'>
2 <class 'str'>
3 <class 'int'>
4 <class 'str'>
Name: Column_1, dtype: object
EDIT for last paragraph: Here is possible add errors='coerce'
for convert non numeric to missing values and then replace them by original:
df.loc[~m, 'Column_1'] = pd.to_numeric(df.loc[~m, 'Column_1'], errors='coerce').fillna(df['Column_1'])
print (df)
S.No. Column_1
0 1 256
1 2 1
2 3 $300.55
3 4 756
4 5 $292.34
5 6 Andrew
print (df['Column_1'].apply(type))
0 <class 'float'>
1 <class 'float'>
2 <class 'str'>
3 <class 'float'>
4 <class 'str'>
5 <class 'str'>
Name: Column_1, dtype: object
Upvotes: 3