Reputation: 2981
I have a dataframe in pandas. I am taking the sum of one of its columns as:
x = data['col1'].sum(axis=0)
print(type(x))
I have checked that the col1 column in the data dataframe is of type float64, but the type of x is <class 'float'>. I was expecting the type of x to be numpy.float64.
What is it that I am missing here?
pandas version - '0.18.0', numpy version - '1.10.4', python version - 3.5.2
Upvotes: 4
Views: 5550
Reputation: 2690
This seems to come from the way pandas handles NaNs. When I set skipna=False in the sum method, I get the numpy datatype:
import pandas as pd
import numpy as np
type(pd.DataFrame({'col1':[.1,.2,.3,.4]}).col1.sum(skipna=True))
#float
type(pd.DataFrame({'col1':[.1,.2,.3,.4]}).col1.sum(skipna=False))
#numpy.float64
The sum method calls nansum from pandas/core/nanops.py, which produces the same behaviour:
from pandas.core.nanops import nansum
type(sum(np.arange(10.0)))
# numpy.float64
type(nansum(np.arange(10.0)))
# float
Why nansum converts from numpy.float64 to float, I couldn't tell you. I've looked at the nansum source code, but none of the functions it calls seem to produce that change.
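If the exact scalar type matters downstream, one possible workaround is to wrap the result explicitly rather than rely on what sum returns. This is only a minimal sketch: the dataframe mirrors the example above, the name x64 is made up for illustration, and the type of x itself may differ between pandas versions.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'col1': [.1, .2, .3, .4]})

# Depending on the pandas version, this may be a Python float or a numpy.float64.
x = data['col1'].sum()

# numpy.float64 is a subclass of Python's float, so this check holds either way.
print(isinstance(x, float))  # True

# Wrapping the result forces the numpy scalar type when it is actually needed.
x64 = np.float64(x)  # x64 is a hypothetical name used for illustration
print(type(x64))  # <class 'numpy.float64'>
```

Since numpy.float64 subclasses float, most code that accepts one will accept the other, so the explicit wrap is probably only needed where the numpy scalar's attributes or methods are required.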
Upvotes: 3