Reputation: 93
I have a file that lists deposit balances as strings. In order to plot these numbers, I'm trying to convert the objects to floats. So I wrote code to remove the $ and to strip spaces before and after the values.
member_clean['TotalDepositBalances'] = member_clean['TotalDepositBalances'].str.replace('$', '')
member_clean['TotalDepositBalances'] = member_clean['TotalDepositBalances'].str.strip()
member_clean['TotalDepositBalances'] = member_clean['TotalDepositBalances'].astype(float)
When I run the code, I get an error message that says
ValueError: could not convert string to float:
That's it. Before I added the str.strip, the error message showed me that some values had spaces before and after, so I knew to remove those. But I'm a little confused about what else could be causing it.
I looked at the values of the column after I removed the spaces and $, and everything looked normal. Here's a sample.
Any ideas of what I could check for in the column that may be causing this error?
Upvotes: 2
Views: 1175
Reputation: 4487
You have to delete the commas; Python does not recognize them as part of a numeric format. So, taking the list you gave as possible input:
str_num = ['309.00 ', ' 38.00 ', ' 12,486.00 ', '6,108.00', ' 2,537.00']
you have to do this:
list(map(lambda s: float(s.replace(',', '')), str_num))
which gives you the list of floats:
[309.0, 38.0, 12486.0, 6108.0, 2537.0]
Note: you don't need str.strip(), because float() automatically ignores leading and trailing spaces during the cast.
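As a quick check (a minimal sketch, not part of the original answer), you can see that Python's float() already tolerates surrounding whitespace:

```python
# float() ignores leading/trailing whitespace, so an explicit strip is redundant
values = ['309.00 ', ' 38.00 ']
print([float(v) for v in values])  # [309.0, 38.0]
```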
Following your pipeline, before converting to float, you need to do:
member_clean['TotalDepositBalances'] = member_clean['TotalDepositBalances'].str.replace(',', '')
Or you can run your entire pipeline on one line of code as follows:
member_clean['TotalDepositBalances'] = member_clean['TotalDepositBalances'].str.replace('$', '').str.replace(',', '').astype(float)
Here you will find tests comparing various methods for performing multiple substitutions in a string. Surprisingly, cascading replace calls (as in your pipeline) turn out to be more efficient than a regex for this type of operation. Give it a read.
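A minimal sketch of such a comparison (the sample data and the names s, chained, regexed are made up for illustration; absolute timings will vary by machine and pandas version):

```python
import timeit
import pandas as pd

s = pd.Series(['$1,234.00 ', ' $56,789.00'] * 5000)

# cascade of literal replacements, as in the pipeline above
chained = lambda: s.str.replace('$', '', regex=False).str.replace(',', '', regex=False).str.strip()
# single regex doing all three substitutions at once
regexed = lambda: s.str.replace(r'[$,\s]', '', regex=True)

# both approaches should clean to the same strings
assert chained().equals(regexed())

print('chained:', timeit.timeit(chained, number=10))
print('regex  :', timeit.timeit(regexed, number=10))
```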
Upvotes: 3
Reputation: 12078
A useful method for working with large datasets or series is to create a lookup dictionary of corrected values so that duplicate values aren't re-calculated:
import pandas as pd
import numpy as np
import re

def fast_num_conversion(s):
    """
    This is an extremely fast approach to parsing messy numbers to floats.
    For large data, the same values are often repeated. Rather than
    re-parse these, we store all unique values, parse them once, and
    use a lookup to convert all figures.
    (Should be ~10x faster than converting without the lookup dict.)
    Note: the input must be a pandas Series.
    """
    f_convert = lambda x: re.sub(r'[$\-,\| ]', '', x)
    f_float = lambda x: float(x) if x != '' else np.nan
    vals = {curr: f_float(f_convert(curr)) for curr in s.unique()}
    return s.map(vals)

str_num = ['309.00', '38 .00 ', '12, 486.00', '6,108.00', '2,537.00']
print(fast_num_conversion(pd.Series(str_num)))
0 309.0
1 38.0
2 12486.0
3 6108.0
4 2537.0
Upvotes: 0