user3738579
user3738579

Reputation: 33

How to shift entire groups in pandas groupby

Given the following data:

data = {'a' : [1,1,1,8,8,3,3,3,3,4,4] }
df = pd.DataFrame(data)

I would now like to shift the whole thing down by n groups, so that their current order is preserved. The desired output for a shift of n=1 would be:

desired_output = {'a': [NaN,NaN,NaN,1,1,8,8,8,8,3,3] }
desired_output_df = pd.DataFrame(desired_output)

a shift of n=2 should be:

desired_output = {'a': [NaN,NaN,NaN,NaN,NaN,1,1,1,1,8,8] }
desired_output_df = pd.DataFrame(desired_output)

I have been messing around with groupby/transform/apply but haven't gotten anything to work so far. If I groupby and then shift, it shifts each group giving the output of:

NOT_desired_output = {'a' : [NaN, 1, 1, NaN, 8, NaN, 3,3,3, NaN, 4]}

I could brute force it by iterating, but I'm sure there's a better solution. Any ideas?

Upvotes: 3

Views: 1850

Answers (1)

Alex Riley
Alex Riley

Reputation: 176740

This is an interesting operation. I can think of an alternative way to do it with replace.

To shift by 1 group:

>>> df['b'] = df.a.shift()
>>> x = df[df.a != df.b]
>>> df.replace(*x.values.T)

Which gives the DataFrame:

     a   b
0  NaN NaN
1  NaN NaN
2  NaN NaN
3    1 NaN
4    1   1
5    8   1
6    8   8
7    8   8
8    8   8
9    3   8
10   3   3

And we just want column a of this DataFrame:

desired_output_df = pd.DataFrame(_, columns=['a'])

To shift by more than one group, you just need to shift column b of x. If you want to shift by n groups, you need to shift x.b an additional n-1 times. Just insert the line

>>> x.b = x.b.shift(n-1)

after x = df[df.a != df.b] and then perform the df.replace(*x.values.T) step.

Upvotes: 2

Related Questions