Reputation: 4739
I have a df with currency:
df = pd.DataFrame({'Currency':['$1.00','$2,000.00','(3,000.00)']})
Currency
0 $1.00
1 $2,000.00
2 (3,000.00)
I want to convert the 'Currency' dtype to float but I am having trouble with the parentheses string (which indicate a negative amount). This is my current code:
df[['Currency']] = df[['Currency']].replace('[\$,]','',regex=True).astype(float)
which produces an error:
ValueError: could not convert string to float: (3000.00)
What I want as dtype float is:
Currency
0 1.00
1 2000.00
2 -3000.00
Upvotes: 25
Views: 27842
Reputation: 11701
Here is a utility function which could take in pandas dataframe and handle currency based data in any of the columns.
def handle_currency_column(df: pd.DataFrame) -> pd.DataFrame:
for column in df.columns:
try:
df[column] = (
df[column]
.replace(r"[\$,)]", "", regex=True)
.replace("[(]", "-", regex=True)
.replace(" ", "", regex=True)
.replace("", "NaN", regex=True)
.astype(float)
)
except Exception as e:
# do nothing if column is a non numeric string column
print(
f"Error while converting column {column} to float. Error: {e}"
)
continue
return df
Usage :
...
df = handle_currency_column(df)
Upvotes: 0
Reputation: 33
This is if you want to make sure its added to the DataFrame.
df['Currency']=(df['Currency'].replace( '[\$,)]','', regex=True ) .replace( '[(]','-', regex=True ).astype(float))
Upvotes: 1
Reputation: 30424
Just add )
to the existing command, and then convert (
to -
to make numbers in parentheses negative. Then convert to float.
(df['Currency'].replace( '[\$,)]','', regex=True )
.replace( '[(]','-', regex=True ).astype(float))
Currency
0 1
1 2000
2 -3000
Upvotes: 47