Reputation: 1069
pandas version 0.13
import pandas as pd
d = {'one': ['97628', '97628', '97628.271', '97628271'],
     'two': ['98800', '98800', '98800.000', '98800000']}
a = pd.DataFrame(d)
a
a.dtypes
one object
two object
dtype: object
Everything looks good up to this point. I then try to convert the strings into floats.
a.loc[:,'one'] = a.loc[:,'one'].astype(float)
a.loc[:,'two'] = a.loc[:,'two'].astype(float)
Nothing changes after I execute the code.
a.dtypes
one object
two object
dtype: object
The worst part is that the data shown in the dataframe has changed.
Is this a bug or am I changing the data types incorrectly?
Upvotes: 0
Views: 193
Reputation: 375415
What's happening here is that the conversion itself works correctly:
In [21]: a.loc[:,'one'].astype(float)
Out[21]:
0 97628.000
1 97628.000
2 97628.271
3 97628271.000
Name: one, dtype: float64
but the result is being assigned back into an existing object column, so the column's dtype stays object. What you're seeing is simply number formatting; the numbers themselves are correct.
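One way around this is plain column assignment instead of `.loc[:, ...]`, which replaces the column outright rather than writing into the existing object column. This is a minimal sketch that reuses the frame from the question and assumes replacing the whole column is acceptable:

```python
import pandas as pd

a = pd.DataFrame({'one': ['97628', '97628', '97628.271', '97628271'],
                  'two': ['98800', '98800', '98800.000', '98800000']})

# Replace each column with its float version instead of writing the
# converted values into the existing object column via .loc[:, ...].
a['one'] = a['one'].astype(float)
a['two'] = a['two'].astype(float)

print(a.dtypes)
```

If every column should become float, `a = a.astype(float)` does the same thing in one step.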
A nice way to do this is to use convert_objects:
In [11]: a.convert_objects(convert_numeric=True)
Out[11]:
one two
0 97628.000 98800
1 97628.000 98800
2 97628.271 98800
3 97628271.000 98800000
[4 rows x 2 columns]
In [12]: a.convert_objects(convert_numeric=True).dtypes
Out[12]:
one float64
two float64
dtype: object
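Note that `convert_objects` was deprecated in later pandas releases and eventually removed, so the call above may not exist in the version you have installed. The usual replacement is `pd.to_numeric`. A minimal sketch, assuming a pandas version that provides `pd.to_numeric` and reusing the frame from the question (the name `converted` is just for illustration):

```python
import pandas as pd

a = pd.DataFrame({'one': ['97628', '97628', '97628.271', '97628271'],
                  'two': ['98800', '98800', '98800.000', '98800000']})

# Apply pd.to_numeric to each column; errors='coerce' turns any string
# that cannot be parsed as a number into NaN instead of raising.
converted = a.apply(pd.to_numeric, errors='coerce')

print(converted.dtypes)
```

Like `convert_objects`, this returns a new frame and leaves `a` untouched, so assign the result back if you want to keep it.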
Upvotes: 5