Classifying items in a panel dataframe based on their average value over time

Question

I have a multi-index dataframe of panel data that shows historical prices over time. What I basically want to do is group them by their average forecast over time- high (above 0.4), medium (between 0.2 and 0.4), and low (below 0.2).

                      price   yield   forecast  

 date       assetid                             
 1/1/2017   4200      96.44    0.23       0.64  
 1/1/2017   408        46.3     0.6        0.4  
 1/1/2017   413       50.68    0.47       0.73  
 1/1/2017   3911      82.48    0.33       0.84  
 1/1/2017   7392      97.24     0.4       0.62  
 1/1/2017   7144      31.86    0.18       0.54  
 1/1/2017   8793      59.66    0.65        0.9  
 1/2/2017   4200       57.1    0.21       0.69  
 1/2/2017   408        4.76    0.86       0.16  
 1/2/2017   413       70.79    0.24       0.12  
 1/2/2017   3911       5.43    0.91       0.44  
 1/2/2017   7392      47.33    0.51       0.18  
 1/2/2017   7144      17.85    0.79       0.59  
 1/2/2017   8793      98.08     0.2       0.24

So the first step I need to figure out is how to create a table that returns the mean value of each asset.

  assetid   Average of forecast  
     408               0.73  
     413              0.355  
    3911               0.62  
    4200               0.22  
    7144              0.485  
    7392              0.455  
    8793              0.425

Then, I have to just join this newly created table and match on the assetid, with the same classification over time.

                      price   yield   forecast   type  
 date       assetid                                    
 1/1/2017   4200      96.44    0.23       0.64   med   
 1/1/2017   408        46.3     0.6        0.4   high  
 1/1/2017   413       50.68    0.47       0.73   med   
 1/1/2017   3911      82.48    0.33       0.84   med   
 1/1/2017   7392      97.24     0.4       0.62   high  
 1/1/2017   7144      31.86    0.18       0.54   low   
 1/1/2017   8793      59.66    0.65        0.9   high  
 1/2/2017   4200       57.1    0.21       0.69   med   
 1/2/2017   408        4.76    0.86       0.16   high  
 1/2/2017   413       70.79    0.24       0.12   med   
 1/2/2017   3911       5.43    0.91       0.44   med   
 1/2/2017   7392      47.33    0.51       0.18   high  
 1/2/2017   7144      17.85    0.79       0.59   low   
 1/2/2017   8793      98.08     0.2       0.24   high

BENY · Accepted Answer

pd.cut + transform

df['type']=pd.cut(df.groupby(level='assetid')
               .forecast
                  .transform('mean'),[0,0.2,0.4,np.inf],labels=['low','med','high'])
df
Out[663]: 
                  price  yield  forecast  type
date     assetid                              
1/1/2017 4200     96.44   0.23      0.64  high
         408      46.30   0.60      0.40   med
         413      50.68   0.47      0.73  high
         3911     82.48   0.33      0.84  high
         7392     97.24   0.40      0.62   med
         7144     31.86   0.18      0.54  high
         8793     59.66   0.65      0.90  high
1/2/2017 4200     57.10   0.21      0.69  high
         408       4.76   0.86      0.16   med
         413      70.79   0.24      0.12  high
         3911      5.43   0.91      0.44  high
         7392     47.33   0.51      0.18   med
         7144     17.85   0.79      0.59  high
         8793     98.08   0.20      0.24  high

Classifying items in a panel dataframe based on their average value over time

Answers (2)

Related Questions