krackoder

Reputation: 2981

Sum of a 'float64' column in pandas returns float instead of numpy.float64

I have a dataframe in pandas, and I take the sum of one of its columns like this:

x = data['col1'].sum(axis=0)
print(type(x))

I have checked that the col1 column of the data dataframe has dtype float64, but the type of x is <class 'float'>. I was expecting the type of x to be numpy.float64.

What am I missing here?

pandas version: 0.18.0, numpy version: 1.10.4, Python version: 3.5.2

Upvotes: 4

Views: 5550

Answers (1)

Elliot

Reputation: 2690

This seems to come from the way pandas handles NaNs. When I set skipna=False in the sum method, I get the numpy datatype:

import pandas as pd
import numpy as np

# Build the frame once and compare the two skipna settings
df = pd.DataFrame({'col1': [.1, .2, .3, .4]})

type(df.col1.sum(skipna=True))
# float

type(df.col1.sum(skipna=False))
# numpy.float64
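
In practice the difference may matter less than it looks, since numpy.float64 is a subclass of Python's float. Here is a minimal sketch that separates the column's dtype from the type of the scalar that sum returns. It assumes a small stand-in frame named data like the one in the question, with values made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `data` frame (values are invented).
data = pd.DataFrame({'col1': [0.1, 0.2, 0.3, 0.4]})

col_dtype = data['col1'].dtype   # dtype of the stored column
x = data['col1'].sum()           # scalar produced by the reduction

print(col_dtype)                 # the column's storage dtype
print(type(x))                   # float or numpy.float64, depending on the setup

# numpy.float64 subclasses Python's float, so this check holds in either case.
is_float = isinstance(x, float)
print(is_float)
```

An isinstance(x, float) check therefore passes whichever type comes back, whereas a strict type(x) is numpy.float64 comparison depends on the skipna setting and possibly on the pandas version.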

The sum method calls nansum from pandas/core/nanops.py, which shows the same behaviour:

from pandas.core.nanops import nansum

type(sum(np.arange(10.0)))
# numpy.float64

type(nansum(np.arange(10.0)))
# float

Why nansum converts numpy.float64 to float, I couldn't tell you. I've looked at the nansum source code, but none of the functions it calls seems to produce that change.
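
If the numpy type is needed regardless of the cause, one workaround is to convert the result explicitly, or to sum the underlying array with numpy directly. This is a sketch that assumes NaNs should still be skipped. The sample values, including the NaN, are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value (not from the original question).
data = pd.DataFrame({'col1': [0.1, 0.2, np.nan, 0.4]})

# Option 1: let pandas sum (skipna=True by default), then wrap the result.
total = np.float64(data['col1'].sum())

# Option 2: bypass the pandas reduction and let numpy sum the raw array,
# using np.nansum so that NaNs are ignored as pandas would by default.
total_np = np.nansum(data['col1'].values)

print(type(total), type(total_np))
```

Both options yield a numpy.float64 here. Option 1 keeps the pandas NaN handling, while option 2 relies on numpy's.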

Upvotes: 3
