Reputation: 11575
I have a very large Pandas DataFrame that looks like this:
>>> d = pd.DataFrame({"a": ["1", "U", "3.4"]})
>>> d
     a
0    1
1    U
2  3.4
Currently the column's dtype is object:
>>> d.dtypes
a    object
dtype: object
I'd like to convert this column to float so that I can use groupby() and compute the mean. When I try astype, I get an error, as expected, because of the string that can't be cast to float:
>>> d.a.astype(float)
ValueError: could not convert string to float: 'U'
What I'd like to do is cast all the elements to float and replace the ones that can't be cast with NaN.
How can I do this?
I tried setting raise_on_error=False, but it doesn't work: the dtype is still object.
>>> d.a.astype(float, raise_on_error=False)
0      1
1      U
2    3.4
Name: a, dtype: object
Upvotes: 4
Views: 5351
Reputation: 176810
Use pd.to_numeric and specify errors='coerce' to force strings that can't be parsed as numbers to become NaN:
>>> pd.to_numeric(d['a'], errors='coerce')
0    1.0
1    NaN
2    3.4
Name: a, dtype: float64
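If the end goal is the groupby mean mentioned in the question, a minimal sketch might look like the following. The grouping column "key" and its values are hypothetical (the original frame only has column a), so adapt the names to your data:

```python
import pandas as pd

# Hypothetical frame: "key" is an illustrative grouping column, not part of the question.
df = pd.DataFrame({"key": ["x", "x", "y"], "a": ["1", "U", "3.4"]})

# Strings that cannot be parsed (here "U") become NaN, and the column becomes float64.
df["a"] = pd.to_numeric(df["a"], errors="coerce")

# mean() skips NaN by default, so group "x" averages only its one valid value.
means = df.groupby("key")["a"].mean()
print(means)
```

Assigning the result back to the column is what changes the dtype; to_numeric returns a new Series and does not modify d in place.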
Upvotes: 13