AJG519
AJG519

Reputation: 3379

Pandas Shift values within group for subset of groups

I am looking to shift values within group, but only for a subset of the groups in the Series.

I have something like the following Series:

import pandas as pd
df=pd.DataFrame()
df['Group']=['A','A','A','B','B','B','C','C','C']
df['Month']=[1,2,3,1,2,3,1,2,3]
df['Value']=1000,900,800,700,600,500,400,300,200
df=df.set_index(['Group','Month'])['Value']
df
Out[101]: 
Group  Month
A      1        1000
       2         900
       3         800
B      1         700
       2         600
       3         500
C      1         400
       2         300
       3         200
Name: Value, dtype: int64

I want group A to remain as it is, but groups B and C should have their values shifted up by one. I can do this across all groups (as described in this question as follows:

 df.groupby(level='Group').transform(lambda x:x.shift(-1))

However, to do this for only a subset of groups, I can't come up with a more elegant way than separating the series and then concatenating back together, as I show below:

df_a = df[df.index.get_level_values('Group')=='A']
df_other = df[df.index.get_level_values('Group')<>'A']

pd.concat([df_a,df_other.groupby(level='Group').transform(lambda x:x.shift(-1))])
Out[104]: 
Group  Month
A      1        1000.0
       2         900.0
       3         800.0
B      1         600.0
       2         500.0
       3           NaN
C      1         300.0
       2         200.0
       3           NaN
Name: Value, dtype: float64

Is there a more elegant solution to this?

Upvotes: 2

Views: 1425

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210822

In [361]: df.groupby(level='Group').transform(lambda x:x if x.name=='A' else x.shift(-1))
Out[361]:
Group  Month
A      1        1000.0
       2         900.0
       3         800.0
B      1         600.0
       2         500.0
       3           NaN
C      1         300.0
       2         200.0
       3           NaN
Name: Value, dtype: float64

or you can update only specific rows (solution provided by @cᴏʟᴅsᴘᴇᴇᴅ) - this will simplify solution dramatically:

In [363]: df.loc[['B', 'C']] = df.loc[['B', 'C']].groupby(level=0).shift(-1)

In [364]: df
Out[364]:
Group  Month
A      1        1000.0
       2         900.0
       3         800.0
B      1         600.0
       2         500.0
       3           NaN
C      1         300.0
       2         200.0
       3           NaN
Name: Value, dtype: float64

Upvotes: 3

Related Questions