Pandas DateTimeIndex - create a new value which has the max value of previous month

Question

I have something like the following dataframe (note that dt is the index):

            fx          fy
dt      
2019-05-29  0.000000    0.000000
2019-05-30  65.410004   156.449997
2019-05-31  70.279999   125.040001
2019-06-01  49.220001   147.979996
2019-06-02  100.580002  232.539993
2019-06-03  262.230011  468.809998
2019-06-04  383.779999  525.390015
2019-06-05  761.609985  1147.380005
2019-06-06  1060.750000 1727.380005
2019-06-07  1640.300049 2827.120117

What I want to achieve is to have a new column named fz where each day's value is the previous month's max value of fy, so the result would be:

            fx          fy          fz
dt          
2019-05-29  0.000000    0.000000    NaN
2019-05-30  65.410004   156.449997  NaN
2019-05-31  70.279999   125.040001  NaN
2019-06-01  49.220001   147.979996  156.449997
2019-06-02  100.580002  232.539993  156.449997
2019-06-03  262.230011  468.809998  156.449997
2019-06-04  383.779999  525.390015  156.449997
2019-06-05  761.609985  1147.380005 156.449997
2019-06-06  1060.750000 1727.380005 156.449997
2019-06-07  1640.300049 2827.120117 156.449997

The first month's fz is empty because there is no previous month. I tried a combination of pd.Grouper(freq='M') with .transform() and .shift(-1, freq='M') but failed miserably as it changed the index entirely, and I would like to keep the index as is.

How can I solve this for arbitrary N months back?

jezrael · Accepted Answer

Convert the index to monthly periods with DatetimeIndex.to_period, take the per-month maximum of fy, shift it by N, and map the result back onto the rows with Index.map:

# sample data changed so that the DatetimeIndex spans several months
print (df)
                     fx           fy
dt                                  
2019-05-29     0.000000     0.000000
2019-05-30    65.410004   156.449997
2019-05-31    70.279999   125.040001
2019-06-01    49.220001   147.979996
2019-06-02   100.580002   232.539993
2019-07-03   262.230011   468.809998
2019-07-04   383.779999   525.390015
2019-08-05   761.609985  1147.380005
2019-08-06  1060.750000  1727.380005
2019-09-07  1640.300049  2827.120117

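N = 2
# monthly period of each row ('M' is the documented alias for monthly periods)
s = df.index.to_period('M')
# per-month max of fy, shifted by N rows (one row per month present), mapped back onto the rows
df['fz'] = s.map(df.groupby(s)['fy'].max().shift(N))
print (df)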
                     fx           fy          fz
dt                                              
2019-05-29     0.000000     0.000000         NaN
2019-05-30    65.410004   156.449997         NaN
2019-05-31    70.279999   125.040001         NaN
2019-06-01    49.220001   147.979996         NaN
2019-06-02   100.580002   232.539993         NaN
2019-07-03   262.230011   468.809998  156.449997
2019-07-04   383.779999   525.390015  156.449997
2019-08-05   761.609985  1147.380005  232.539993
2019-08-06  1060.750000  1727.380005  232.539993
2019-09-07  1640.300049  2827.120117  525.390015
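
For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. It rebuilds only the fy column of the sample data and splits the one-liner into named steps; the variable names months, monthly_max and lookup are illustrative and not part of the original answer.

```python
import pandas as pd

# fy values and dates copied from the sample data above (fx omitted for brevity).
df = pd.DataFrame(
    {
        "fy": [0.0, 156.449997, 125.040001, 147.979996, 232.539993,
               468.809998, 525.390015, 1147.380005, 1727.380005, 2827.120117],
    },
    index=pd.DatetimeIndex(
        ["2019-05-29", "2019-05-30", "2019-05-31", "2019-06-01", "2019-06-02",
         "2019-07-03", "2019-07-04", "2019-08-05", "2019-08-06", "2019-09-07"],
        name="dt",
    ),
)
original_index = df.index.copy()

N = 2
months = df.index.to_period("M")              # monthly Period for every row
monthly_max = df.groupby(months)["fy"].max()  # one value per month present in the data
lookup = monthly_max.shift(N)                 # move each max N rows later
df["fz"] = months.map(lookup)                 # look up each row's month in the shifted series

print(df)
```

The daily DatetimeIndex is never replaced here: the grouping and shifting happen on a separate monthly series, and only the mapped values are assigned back as a column.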

If the datetimes are not consecutive and some months are missing, shift counts rows rather than calendar months, so add N to the PeriodIndex with rename instead:

print (df)
                     fx           fy
dt                                  
2019-05-29     0.000000     0.000000
2019-05-30    65.410004   156.449997
2019-05-31    70.279999   125.040001
2019-06-01    49.220001   147.979996
2019-06-02   100.580002   232.539993
2019-08-03   262.230011   468.809998
2019-08-04   383.779999   525.390015
2019-09-05   761.609985  1147.380005
2019-09-06  1060.750000  1727.380005
2019-09-07  1640.300049  2827.120117

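N = 1
# monthly period of each row ('M' is the documented alias for monthly periods)
s = df.index.to_period('M')
# relabel each month's max with the month N later, so a missing month maps to NaN
df['fz'] = s.map(df.groupby(s)['fy'].max().rename(lambda x: x + N))
print (df)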
                     fx           fy          fz
dt                                              
2019-05-29     0.000000     0.000000         NaN
2019-05-30    65.410004   156.449997         NaN
2019-05-31    70.279999   125.040001         NaN
2019-06-01    49.220001   147.979996  156.449997
2019-06-02   100.580002   232.539993  156.449997
2019-08-03   262.230011   468.809998         NaN
2019-08-04   383.779999   525.390015         NaN
2019-09-05   761.609985  1147.380005  525.390015
2019-09-06  1060.750000  1727.380005  525.390015
2019-09-07  1640.300049  2827.120117  525.390015
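
Since the question asks for arbitrary N, the gap-safe variant can be wrapped in a small function. This is only a sketch: add_prev_month_max and its parameter names are hypothetical, and it adds N directly to the PeriodIndex, which should be equivalent to the rename call above.

```python
import pandas as pd


def add_prev_month_max(df, source="fy", target="fz", n=1):
    """Hypothetical helper: fill `target` with the max of `source` from n calendar months earlier."""
    months = df.index.to_period("M")
    monthly_max = df.groupby(months)[source].max()
    # Relabel each month's max with the month that should receive it.
    monthly_max.index = monthly_max.index + n
    out = df.copy()  # leave the caller's frame untouched
    out[target] = months.map(monthly_max)  # months with no source month become NaN
    return out


# Sample data with a missing July, copied from the example above (fx omitted).
df = pd.DataFrame(
    {
        "fy": [0.0, 156.449997, 125.040001, 147.979996, 232.539993,
               468.809998, 525.390015, 1147.380005, 1727.380005, 2827.120117],
    },
    index=pd.DatetimeIndex(
        ["2019-05-29", "2019-05-30", "2019-05-31", "2019-06-01", "2019-06-02",
         "2019-08-03", "2019-08-04", "2019-09-05", "2019-09-06", "2019-09-07"],
        name="dt",
    ),
)

one_back = add_prev_month_max(df, n=1)
two_back = add_prev_month_max(df, n=2)
print(one_back)
print(two_back)
```

With n=2 only the August rows receive a value (June's max), because September would need July, which is absent from the data.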
