Usal

Reputation: 45

Strange output of sum of DataFrame. Why does df.sum() return Series([], dtype: float64)?

The code

import pandas as pd
df = pd.DataFrame({'Col1': [1, '-', '-'], 'Col2': ['-', '-', 2]})
df.sum()

returns

Series([], dtype: float64)

I assumed the strings were causing the problem, but this does not work either:

df.sum(numeric_only=True)

returns

Series([], dtype: float64)

I don't understand what the output is trying to tell me and why I am getting it in the first place.

Only when I replace the strings with zeroes is the result as expected:

df.replace('-',0).sum()

returns

Col1    1
Col2    2
dtype: int64

Upvotes: 0

Views: 838

Answers (2)

Usal

Reputation: 45

It's a bug. I reported it, and it has been recognized as issue #39903.
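Independent of the bug, here is a minimal sketch of one way to get the sums without replacing values by hand. It assumes the '-' placeholders should be treated as missing values, which is my reading of the question rather than something it states. Both columns mix ints and strings, so pandas stores them with dtype object and neither is a numeric column. The names `numeric` and `totals` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [1, '-', '-'], 'Col2': ['-', '-', 2]})

# Mixed int/str columns are stored as dtype object, not as a numeric dtype.
print(df.dtypes)

# Selecting only numeric columns therefore leaves no columns at all.
numeric_columns = df.select_dtypes(include='number')
print(numeric_columns.shape[1])

# Convert each column to numbers; values that cannot be parsed ('-') become NaN.
numeric = df.apply(pd.to_numeric, errors='coerce')

# sum() skips NaN by default (skipna=True).
totals = numeric.sum()
print(totals)
```

Because the coerced columns contain NaN, the result comes back as floats rather than ints.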

Upvotes: 1

Giovanni Frison

Reputation: 688

I assume that pandas' sum works like the built-in Python one. If so, a column with only one number gives you no iterable, and therefore nothing to sum.

It would be like calling sum(1), which raises TypeError: 'int' object is not iterable

whereas sum([1, 0, 0]) gives you an answer, even if a trivial one.
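For reference, the built-in behaviour described here can be checked with a short sketch. It only shows how Python's own sum() reacts to a bare int and to a list; it does not show that pandas follows the same path, which remains an assumption of this answer. The variable names are illustrative.

```python
# The built-in sum() needs an iterable; a bare int raises TypeError.
try:
    sum(1)
except TypeError as exc:
    message = str(exc)
    print(message)

# A list is iterable, so this works even though the result is trivial.
total = sum([1, 0, 0])
print(total)
```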

Upvotes: 0
