Reputation: 33
Given the following data:
data = {'a' : [1,1,1,8,8,3,3,3,3,4,4] }
df = pd.DataFrame(data)
I would now like to shift the whole thing down by n groups, so that their current order is preserved. The desired output for a shift of n=1 would be:
desired_output = {'a': [NaN,NaN,NaN,1,1,8,8,8,8,3,3] }
desired_output_df = pd.DataFrame(desired_output)
a shift of n=2 should be:
desired_output = {'a': [NaN,NaN,NaN,NaN,NaN,1,1,1,1,8,8] }
desired_output_df = pd.DataFrame(desired_output)
I have been messing around with groupby/transform/apply but haven't gotten anything to work so far. If I groupby and then shift, it shifts each group giving the output of:
NOT_desired_output = {'a' : [NaN, 1, 1, NaN, 8, NaN, 3,3,3, NaN, 4]}
I could brute force it by iterating, but I'm sure there's a better solution. Any ideas?
Upvotes: 3
Views: 1850
Reputation: 176740
This is an interesting operation. I can think of an alternative way to do it with replace
.
To shift by 1 group:
>>> df['b'] = df.a.shift()
>>> x = df[df.a != df.b]
>>> df.replace(*x.values.T)
Which gives the DataFrame:
a b
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 1 NaN
4 1 1
5 8 1
6 8 8
7 8 8
8 8 8
9 3 8
10 3 3
And we just want column a
of this DataFrame:
desired_output_df = pd.DataFrame(_, columns=['a'])
To shift by more than one group, you just need to shift column b
of x
. If you want to shift by n
groups, you need to shift x.b
an additional n-1
times. Just insert the line
>>> x.b = x.b.shift(n-1)
after x = df[df.a != df.b]
and then perform the df.replace(*x.values.T)
step.
Upvotes: 2