Reputation: 329
The example dataframe I have is:
>>> new_df
date country score
0 2018-01-01 ch 50
1 2018-01-01 es 100
2 2018-01-01 us 150
3 2018-01-02 ch 10
4 2018-01-02 gb 100
5 2018-01-02 us 125
6 2018-01-03 us 160
Why does new_df.groupby(["date", "country"]).diff()
produce NaN for every row?
>>> new_df.groupby(["date", "country"]).diff()
score
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
Upvotes: 0
Views: 882
Reputation: 30930
As you can see below, the size of each group is 1,
so the result of the subtraction is NaN.
A subtraction needs both a minuend and a subtrahend, that is, a group size of at least 2:
new_df.groupby(['date','country']).size()
date country
2018-01-01 ch 1
es 1
us 1
2018-01-02 ch 1
gb 1
us 1
2018-01-03 us 1
dtype: int64
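For comparison, here is a minimal sketch that assumes what you actually want is the change in score for each country from one date to the next. The question does not say so, so treat that as my guess. Grouping by country alone gives groups with more than one row, and diff() then has something to subtract:

```python
import pandas as pd

# Rebuild the example frame from the question
new_df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-01-01"] * 3 + ["2018-01-02"] * 3 + ["2018-01-03"]
    ),
    "country": ["ch", "es", "us", "ch", "gb", "us", "us"],
    "score": [50, 100, 150, 10, 100, 125, 160],
})

# Every (date, country) pair occurs exactly once, so each group has one row
sizes = new_df.groupby(["date", "country"]).size()

# Assumption: the intended grouping is by country only.
# The rows are already in date order, so diff() compares consecutive dates.
by_country = new_df.groupby("country")["score"].diff()

print(by_country)
# Row 3 (ch): 10 - 50   = -40
# Row 5 (us): 125 - 150 = -25
# Row 6 (us): 160 - 125 =  35
# The first row of each country is still NaN (nothing earlier to subtract)
```

Countries that appear only once (es and gb here) still give NaN, for the same reason as above.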
Upvotes: 2
Reputation: 249444
It's because there is nothing to subtract: you have only one value per group in your example.
Upvotes: 1