DataByDavid

Reputation: 1069

Trouble converting strings into floats

I'm using pandas version 0.13.

Here is a dummy DataFrame:

import pandas as pd

d = {'one': ['97628', '97628', '97628.271', '97628271'],
     'two': ['98800', '98800', '98800.000', '98800000']}

a = pd.DataFrame(d)
a


a.dtypes

one object
two object
dtype: object

Everything looks good up to this point. I then try to convert the strings into floats.

a.loc[:,'one'] = a.loc[:,'one'].astype(float)  
a.loc[:,'two'] = a.loc[:,'two'].astype(float)  

Nothing changes after I execute the code.

a.dtypes

one object
two object
dtype: object

The worst part is that the data displayed in the DataFrame has changed.


Is this a bug or am I changing the data types incorrectly?

Upvotes: 0

Views: 193

Answers (1)

Andy Hayden

Reputation: 375415

The conversion itself is working correctly:

In [21]: a.loc[:,'one'].astype(float)
Out[21]: 
0       97628.000
1       97628.000
2       97628.271
3    97628271.000
Name: one, dtype: float64

but the result is being assigned back into an object column, so the column's dtype stays object. What you're seeing is simply a change in number formatting; the numbers themselves are correct.
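To make that concrete, here is a minimal sketch (not taken from the original answer) of an object column holding floats, followed by one way to get a float64 column. It assumes that plain column assignment (`fixed[col] = ...`) replaces the column rather than writing into the existing one. The names `held_as_object` and `fixed` are made up for illustration.

```python
import pandas as pd

a = pd.DataFrame({'one': ['97628', '97628', '97628.271', '97628271'],
                  'two': ['98800', '98800', '98800.000', '98800000']})

# An object column can hold float values without its dtype changing:
# the values are numbers, but the column still reports dtype object.
held_as_object = a['one'].astype(float).astype(object)
print(held_as_object.dtype)
print(type(held_as_object.iloc[2]))

# Assigning with fixed[col] = ... swaps in the converted column,
# so the float64 dtype is kept.
fixed = a.copy()
for col in ['one', 'two']:
    fixed[col] = fixed[col].astype(float)
print(fixed.dtypes)
```

How `a.loc[:, 'one'] = ...` behaves has varied between pandas releases, so the sketch avoids relying on it.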

A nice way to do this is to use convert_objects:

In [11]: a.convert_objects(convert_numeric=True)
Out[11]: 
            one       two
0     97628.000     98800
1     97628.000     98800
2     97628.271     98800
3  97628271.000  98800000

[4 rows x 2 columns]

In [12]: a.convert_objects(convert_numeric=True).dtypes
Out[12]: 
one    float64
two    float64
dtype: object
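Note that `convert_objects` was later deprecated and is no longer available as of pandas 1.0. On a newer pandas, a rough equivalent (a sketch, assuming every column should become numeric) is to apply `pd.to_numeric` to each column. The name `numeric` is chosen here for illustration.

```python
import pandas as pd

a = pd.DataFrame({'one': ['97628', '97628', '97628.271', '97628271'],
                  'two': ['98800', '98800', '98800.000', '98800000']})

# Convert every column; this returns a new DataFrame and leaves `a` untouched.
numeric = a.apply(pd.to_numeric)
print(numeric.dtypes)
```

If some strings might not parse as numbers, `pd.to_numeric` also accepts `errors='coerce'`, which turns unparseable values into NaN instead of raising.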

Upvotes: 5
