Reputation: 115
I have a DataFrame in which some rows consist entirely of NaN values. When I apply sum to one of these rows, it returns zero instead of NaN. The code is as follows:
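import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(10, 60, size=(5, 3)),
                  index=['a', 'c', 'e', 'f', 'h'],
                  columns=['One', 'Two', 'Three'])
# Reindexing adds rows 'b', 'd' and 'g', filled entirely with NaN
df = df.reindex(index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print(df.loc['b'].sum())  # prints 0.0, not nan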
Any suggestions?
Upvotes: 0
Views: 696
Reputation: 30930
By default, sum skips NaN values, so a row that contains only NaN sums to 0.
If you want the sum of an all-NaN row to be NaN instead, pass min_count=1:
df.loc['b'].sum(min_count=1)
Output:
nan
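To see the difference without random data, here is a minimal sketch on a hand-built Series. The Series `row` is a hypothetical stand-in for df.loc['b'] after the reindex, and the variable names are only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical all-NaN row, standing in for df.loc['b'] after the reindex
row = pd.Series([np.nan, np.nan, np.nan], index=['One', 'Two', 'Three'])

# Default: NaN values are skipped, so nothing is left to add and the sum is 0.0
default_sum = row.sum()

# min_count=1: at least one non-NaN value is required, otherwise the result is NaN
strict_sum = row.sum(min_count=1)

print(default_sum)
print(strict_sum)
```

min_count is the number of valid (non-NaN) values required to perform the operation. If fewer are present, the result is NaN.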
If you apply it to every row (after the reindex), you get the following:
df.sum(axis=1,min_count=1)
a 137.0
b NaN
c 79.0
d NaN
e 132.0
f 95.0
g NaN
h 81.0
dtype: float64
If you now replace one NaN value in a row (the data below comes from a different random run, so the numbers differ from the output above):
df.at['b','One']=0
print(df)
    One   Two  Three
a  54.0  20.0   29.0
b   0.0   NaN    NaN
c  13.0  24.0   27.0
d   NaN   NaN    NaN
e  28.0  53.0   25.0
f  46.0  55.0   50.0
g   NaN   NaN    NaN
h  47.0  26.0   48.0
df.sum(axis=1,min_count=1)
a 103.0
b 0.0
c 64.0
d NaN
e 106.0
f 151.0
g NaN
h 121.0
dtype: float64
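As you can see, row b now sums to 0.0: it has one non-NaN value, which satisfies min_count=1, while rows d and g are still all NaN and stay NaN.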
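The threshold behaviour can be sketched with fixed values taken from rows a, b and d of the frame above. The names `small`, `at_least_one` and `at_least_two` are made up for this example, and min_count=2 is included only to show how raising the threshold changes row b:

```python
import numpy as np
import pandas as pd

# Rows a, b and d from the printed frame, with fixed values
small = pd.DataFrame(
    {
        'One': [54.0, 0.0, np.nan],
        'Two': [20.0, np.nan, np.nan],
        'Three': [29.0, np.nan, np.nan],
    },
    index=['a', 'b', 'd'],
)

# Row b has exactly one valid value, so it passes min_count=1 and sums to 0.0
at_least_one = small.sum(axis=1, min_count=1)

# With min_count=2, row b no longer has enough valid values and becomes NaN
at_least_two = small.sum(axis=1, min_count=2)

print(at_least_one)
print(at_least_two)
```

Row d has no valid values at all, so it is NaN under both settings.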
Upvotes: 1