Vadim
Vadim

Reputation: 4529

pandas: Get an average value of sales for a day of week based on 4 past weeks

I'm stacked, and I need the wisdom of the stackoverflow.

I want to create a simple averaging model for sales based on the day of the week from past using pandas.

I mean, the model predicts value of Sales for next Monday based on the last 4 Mondays. Also, if there are no enough weeks in the past, just make an average of existing weeks.

Sample of my time-series df:

     weekDay weekNum Sales 
Date            
2013-01-01  2   1   22    
2013-01-02  3   1   33    
2013-01-03  4   1   44    
2013-01-04  5   1   55    
2013-01-05  6   1   66    
2013-01-06  7   2   76    
2013-01-07  1   2   23    
2013-01-08  2   2   55    
2013-01-09  3   2   34    
2013-01-10  4   2   43    
2013-01-11  5   2   34    
2013-01-12  6   2   53    
2013-01-13  7   3   52    
2013-01-14  1   3   41    
2013-01-15  2   3   31    
2013-01-16  3   3   31    
2013-01-17  4   3   42    
2013-01-18  5   3   23    
2013-01-19  6   3   41    
2013-01-20  7   4   31    
2013-01-21  1   4   31    
2013-01-22  2   4   31    
2013-01-23  3   4   43    
2013-01-24  4   4   53    
2013-01-25  5   4   32    
2013-01-26  6   4   12    
2013-01-27  7   5   41    
2013-01-28  1   5   41    
2013-01-29  2   5   12    
2013-01-30  3   5   76    
2013-01-31  4   5   43    
2013-02-01  5   5   32    
2013-02-02  6   5   54    
2013-02-03  7   6   43    
2013-02-04  1   6   43    
2013-02-05  2   6   12    
2013-02-06  3   6   12    
2013-02-07  4   6   43    
2013-02-08  5   6   22    
2013-02-09  6   6   12    

What I've tried:

ts.resample('D').rolling(28).mean()["Sales"]
ts.resample('W-MON').rolling(1).mean()["Sales"]

but it didn't help.

Upvotes: 4

Views: 4281

Answers (1)

talz
talz

Reputation: 1200

Ok, Sorry, I think I understand now what you are looking for:

res = df.join(df.groupby('weekDay').rolling('28d').mean(), rsuffix='_avg')

so now res contains:

                   weekDay weekNum  Sales  weekDay_avg  weekNum_avg  Sales_avg
weekDay date                                                                  
1       2013-01-07       1       2     23          1.0          2.0  23.000000
        2013-01-14       1       3     41          1.0          2.5  32.000000
        2013-01-21       1       4     31          1.0          3.0  31.666667
        2013-01-28       1       5     41          1.0          3.5  34.000000
        2013-02-04       1       6     43          1.0          4.5  39.000000
2       2013-01-01       2       1     22          2.0          1.0  22.000000
        2013-01-08       2       2     55          2.0          1.5  38.500000
        2013-01-15       2       3     31          2.0          2.0  36.000000
        2013-01-22       2       4     31          2.0          2.5  34.750000
        2013-01-29       2       5     12          2.0          3.5  32.250000
        2013-02-05       2       6     12          2.0          4.5  21.500000
3       2013-01-02       3       1     33          3.0          1.0  33.000000
        2013-01-09       3       2     34          3.0          1.5  33.500000
        2013-01-16       3       3     31          3.0          2.0  32.666667
        2013-01-23       3       4     43          3.0          2.5  35.250000
        2013-01-30       3       5     76          3.0          3.5  46.000000
        2013-02-06       3       6     12          3.0          4.5  40.500000
4       2013-01-03       4       1     44          4.0          1.0  44.000000
        2013-01-10       4       2     43          4.0          1.5  43.500000
        2013-01-17       4       3     42          4.0          2.0  43.000000
        2013-01-24       4       4     53          4.0          2.5  45.500000
        2013-01-31       4       5     43          4.0          3.5  45.250000
        2013-02-07       4       6     43          4.0          4.5  45.250000
5       2013-01-04       5       1     55          5.0          1.0  55.000000
        2013-01-11       5       2     34          5.0          1.5  44.500000
        2013-01-18       5       3     23          5.0          2.0  37.333333
        2013-01-25       5       4     32          5.0          2.5  36.000000
        2013-02-01       5       5     32          5.0          3.5  30.250000
        2013-02-08       5       6     22          5.0          4.5  27.250000
6       2013-01-05       6       1     66          6.0          1.0  66.000000
        2013-01-12       6       2     53          6.0          1.5  59.500000
        2013-01-19       6       3     41          6.0          2.0  53.333333
        2013-01-26       6       4     12          6.0          2.5  43.000000
        2013-02-02       6       5     54          6.0          3.5  40.000000
        2013-02-09       6       6     12          6.0          4.5  29.750000
7       2013-01-06       7       2     76          7.0          2.0  76.000000
        2013-01-13       7       3     52          7.0          2.5  64.000000
        2013-01-20       7       4     31          7.0          3.0  53.000000
        2013-01-27       7       5     41          7.0          3.5  50.000000
        2013-02-03       7       6     43          7.0          4.5  41.750000

and in order to make it similar to your good ol' DataFrame that we began with:

res.drop(['weekDay', 'weekDay_avg', 'weekNum_avg'], axis=1, inplace=True)
res.reset_index(inplace=True)
res.set_index('date', inplace=True)
res.sort_index(inplace=True)

so now it looks like this:

           weekDay weekNum  Sales  Sales_avg
date                                        
2013-01-01       2       1     22  22.000000
2013-01-02       3       1     33  33.000000
2013-01-03       4       1     44  44.000000
2013-01-04       5       1     55  55.000000
2013-01-05       6       1     66  66.000000
2013-01-06       7       2     76  76.000000
2013-01-07       1       2     23  23.000000
2013-01-08       2       2     55  38.500000
2013-01-09       3       2     34  33.500000
2013-01-10       4       2     43  43.500000
2013-01-11       5       2     34  44.500000
2013-01-12       6       2     53  59.500000
2013-01-13       7       3     52  64.000000
2013-01-14       1       3     41  32.000000
2013-01-15       2       3     31  36.000000
2013-01-16       3       3     31  32.666667
2013-01-17       4       3     42  43.000000
2013-01-18       5       3     23  37.333333
2013-01-19       6       3     41  53.333333
2013-01-20       7       4     31  53.000000
2013-01-21       1       4     31  31.666667
2013-01-22       2       4     31  34.750000
2013-01-23       3       4     43  35.250000
2013-01-24       4       4     53  45.500000
2013-01-25       5       4     32  36.000000
2013-01-26       6       4     12  43.000000
2013-01-27       7       5     41  50.000000
2013-01-28       1       5     41  34.000000
2013-01-29       2       5     12  32.250000
2013-01-30       3       5     76  46.000000
2013-01-31       4       5     43  45.250000
2013-02-01       5       5     32  30.250000
2013-02-02       6       5     54  40.000000
2013-02-03       7       6     43  41.750000
2013-02-04       1       6     43  39.000000
2013-02-05       2       6     12  21.500000
2013-02-06       3       6     12  40.500000
2013-02-07       4       6     43  45.250000
2013-02-08       5       6     22  27.250000
2013-02-09       6       6     12  29.750000

Hope this time this really helps (:

Upvotes: 3

Related Questions