Bowen Peng
Bowen Peng

Reputation: 1815

How to shift values in pandas dataframe based on some columns in different day

I get dataframe as follows:

df
   adcode   model   day   label
   1        0       3     103
   1        1       2     112
   2        2       3     223
   2        2       4     224
   2        2       1     221
   1        0       2     102

# after shift
df
   adcode   model   day   label   next_label   prev_label
   1        0       2     102     103          nan
   1        0       3     103     nan          102
   1        1       2     112     nan          nan
   2        2       1     221     nan          nan
   2        2       3     223     224          nan
   2        2       4     224     nan          223

I want to shift the label of adcode and model in the next or previous day as a new column.
I have tried to df = df.sort_values(by=['adcode', 'model', 'day']) first and use groupby and shift to get next or previous label.
But I was stuck at it. I only know how to shift the label based on ['adcode', 'model', 'day'] like groupby(['adcode', 'model', 'day'])['label'].shift(), but I don't know how shift the label based on ['adcode', 'model'] in different day.
If not mind could anyone help me?
Thanks in advances.

Upvotes: 1

Views: 77

Answers (1)

jezrael
jezrael

Reputation: 862611

Idea is reshape DataFrame for all days in columns, so possible shift, then concat both Series and DataFrame.join to original:

df = df.sort_values(by=['adcode', 'model', 'day'])
df1 = df.set_index(['adcode', 'model', 'day'])['label'].unstack()

new = pd.concat([df1.shift(-1, axis=1).stack(), 
                 df1.shift(axis=1).stack()], 
                 keys=['next_label','prev_label'], axis=1)
df = df.join(new, on=['adcode', 'model', 'day'])
print (df)
   adcode  model  day  label  next_label  prev_label
5       1      0    2    102       103.0         NaN
0       1      0    3    103         NaN       102.0
1       1      1    2    112         NaN         NaN
4       2      2    1    221         NaN         NaN
2       2      2    3    223       224.0         NaN
3       2      2    4    224         NaN       223.0

Detail:

print (df1)
day               1      2      3      4
adcode model                            
1      0        NaN  102.0  103.0    NaN
       1        NaN  112.0    NaN    NaN
2      2      221.0    NaN  223.0  224.0

Upvotes: 2

Related Questions