Padfoot123
Padfoot123

Reputation: 1117

Pandas dataframe amount value with dollar symbol

I have a pandas dataframe with below columns. Column_1 is string/text and not integer or decimal. Couple of rows have string value as well like Names (refer row no 6)

S.No.  Column_1
1      256
2      1
3      $300.54672
4      756
5      $292.34333
6      Andrew

I want to convert all values in column_1 to numbers/int, except the dollar values and rows with Names. I require the dollar symbol to be retained but the amount should be rounded to 2 digits after decimal point.

Expected output:

S.No.  Column_1
1           256
2             1
3       $300.55
4           756
5       $292.34
6       Andrew

I converted entire column to numeric using pd.to_numeric() with errors='coerce' but the amount values became blank (or) null as it was an error.

Any suggestions/help on this would be highly appreciated. Thank you.

Upvotes: 2

Views: 2457

Answers (1)

jezrael
jezrael

Reputation: 863056

Filter values starting by $ by Series.str.startswith, remove $ by Series.str.strip, convert to numeric, round, convert to strings and prepend $:

m = df['Column_1'].str.startswith('$', na=False)

s = '$' + df.loc[m, 'Column_1'].str.strip('$').astype(float).round(2).astype(str)

Or:

s = df.loc[m, 'Column_1'].str.strip('$').astype(float).round(2).astype(str).radd('$')

df.loc[m, 'Column_1'] = s


print (df)
   S.No. Column_1
0      1      256
1      2        1
2      3  $300.55
3      4      756
4      5  $292.34

Last if need non matched values convert to numeric, but get mixed data types - strings with $ and numbers without $:

df.loc[~m, 'Column_1'] = pd.to_numeric(df.loc[~m, 'Column_1'])
print (df)
   S.No.    Column_1
0      1         256
1      2           1
2      3  $300.54672
3      4         756
4      5  $292.34333

print (df['Column_1'].apply(type))
0    <class 'int'>
1    <class 'int'>
2    <class 'str'>
3    <class 'int'>
4    <class 'str'>
Name: Column_1, dtype: object

EDIT for last paragraph: Here is possible add errors='coerce' for convert non numeric to missing values and then replace them by original:

df.loc[~m, 'Column_1'] = pd.to_numeric(df.loc[~m, 'Column_1'], errors='coerce').fillna(df['Column_1'])
print (df)
   S.No. Column_1
0      1      256
1      2        1
2      3  $300.55
3      4      756
4      5  $292.34
5      6   Andrew

print (df['Column_1'].apply(type))

0    <class 'float'>
1    <class 'float'>
2      <class 'str'>
3    <class 'float'>
4      <class 'str'>
5      <class 'str'>
Name: Column_1, dtype: object

Upvotes: 3

Related Questions