Reputation: 490
I am trying to bin a Pandas DataFrame into three-day windows. I have two columns, A and B, which I want to sum in each window. This is the code I wrote for the task:
df = df.groupby(df.index // 3).agg({'A': 'sum', 'B': 'sum'})
It converts NaN values to zero when summing, but I would like an all-NaN window to remain NaN, because my data also contains genuine (non-NaN) zero values.
For example, given this df:
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [np.nan, np.nan],
    [np.nan, 3],  # 3 here, to match the table shown below
    [np.nan, np.nan],
    [2, 0],
    [4, 0],
    [0, 0]
], columns=['A', 'B'])
Index    A    B
0      NaN  NaN
1      NaN    3
2      NaN  NaN
3        2    0
4        4    0
5        0    0
I would like the new df to be:
Index A B
0 NaN 3
1 6 0
But my current code outputs:
Index A B
0 0 3
1 6 0
Upvotes: 7
Views: 5012
Reputation: 104
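df.groupby(df.index // 3)[['A', 'B']].mean()
The above snippet keeps an all-NaN window as NaN, but note that it returns the mean of each window rather than the sum (for example, 2 instead of 6 for A in the second window).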
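If you want the sum, use df.groupby(df.index // 3)[['A', 'B']].sum(min_count=1), which returns NaN for any window that has no non-NaN values. (Select the columns with a list, [['A', 'B']]; the bare ['A', 'B'] form is no longer accepted by recent pandas versions.)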
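As a minimal sketch of the min_count approach, here it is applied to the question's sample data. It assumes row 1 of column B holds 3, as in the table the question displays; the variable name binned is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (B is assumed to be 3 in row 1,
# matching the table the question displays).
df = pd.DataFrame(
    [
        [np.nan, np.nan],
        [np.nan, 3],
        [np.nan, np.nan],
        [2, 0],
        [4, 0],
        [0, 0],
    ],
    columns=["A", "B"],
)

# min_count=1 requires at least one non-NaN value in a window;
# a window with none yields NaN instead of 0.
binned = df.groupby(df.index // 3)[["A", "B"]].sum(min_count=1)
print(binned)
```

With this data, the first window of A stays NaN because it has no valid values, while genuine zeros elsewhere are still summed normally.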
Another option:
df.groupby(df.index // 3).agg({'A': lambda x: x.sum(skipna=False),
                               'B': lambda x: x.sum(skipna=True)})
Upvotes: 2
Reputation: 2222
Try this code:
df.groupby(df.index // 3).agg({'A': lambda x: x.sum(skipna=False),
                               'B': lambda x: x.sum(skipna=False)})
Out[282]:
A B
0 NaN NaN
1 6.0 0.0
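Note that skipna=False makes a window NaN as soon as it contains any NaN, which is why B is NaN in the first window above even though it contains a 3. The sketch below contrasts that with sum(min_count=1) on the question's sample data (again assuming B is 3 in row 1); the names strict and lenient are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, np.nan, np.nan, 2, 4, 0],
    "B": [np.nan, 3, np.nan, 0, 0, 0],
})
groups = df.groupby(df.index // 3)

# skipna=False: any NaN in a window makes that window's sum NaN.
strict = groups.agg({
    "A": lambda x: x.sum(skipna=False),
    "B": lambda x: x.sum(skipna=False),
})

# min_count=1: NaN only when a window has no non-NaN values at all.
lenient = groups.sum(min_count=1)

print(strict)
print(lenient)
```

Which behaviour is right depends on whether a partially missing window should count as missing; the output the question asks for (B = 3 in the first window) corresponds to the min_count=1 variant.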
Upvotes: 0