Reputation: 3962
I've tried replacing the ',' with blank field.
df['amount'].replace(',','', regex=True).astype(float)
Error:
ValueError: could not convert string to float:
df['amount'] = df['amount'].astype('float64')
I still get the same error. The sample data looks like: 5,000.00 and 1,00,000.234 values.
How can I convert it to float?
Btw, I'm reading a json file! with only the path of the file.
Upvotes: 1
Views: 1282
Reputation: 23
Our values like 5,000.00 would be converted to 5000.00 as a float:
df['Withdrawal Amt.'] = [float(str(i).replace(",", "")) for i in df['Withdrawal Amt.']]
Upvotes: 0
Reputation: 862641
I think need assign back:
df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234']})
df['amount'] = df['amount'].replace(',','', regex=True).astype('float64')
print (df)
amount
0 5000.000
1 100000.234
If does not work check if some bad values:
df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234', 'a']})
print (df)
amount
0 5,000.00
1 1,00,000.234
2 a
print (df.loc[pd.to_numeric(df['amount'].replace(',','', regex=True), errors='coerce').isnull(), 'amount'])
2 a
Name: amount, dtype: object
Then is possible convert bad values to NaN
s:
df = pd.DataFrame({'amount':['5,000.00', '1,00,000.234', 'a']})
print (df)
amount
0 5,000.00
1 1,00,000.234
2 a
df['amount'] = pd.to_numeric(df['amount'].replace(',','', regex=True), errors='coerce', downcast='float')
print (df)
amount
0 5000.000
1 100000.234
2 NaN
If use pd.read_csv
for DataFrame
add parameter thousands=','
:
df = pd.read_csv(file, thousands=',')
Upvotes: 1
Reputation: 164673
Using pandas.to_numeric
with pd.Series.str.replace
works for this:
s = pd.Series(['5,000.00', '1,00,000.234'])
s = pd.to_numeric(s.str.replace(',', ''), downcast='float')
print(s)
# 0 5000.000
# 1 100000.234
# dtype: float64
However, a better idea is to fix this at source, if possible. For example, pandas.read_csv
has arguments which allow you to account for such numeric formatting.
Upvotes: 1