Reputation: 41

How to calculate previous n days mean using pandas?

For each day, I want to calculate the mean of the previous 10 days.

For example, in the result table, in column A, '1/11/2000' shows 44, which is the average of A values from '1/1/2000' to '1/10/2000'.

Raw Data:

              A      B       C
1/1/2000     60      62      88 
1/2/2000     46      99      28 
1/3/2000     20      23      94 
1/4/2000     28      19      79 
1/5/2000     58      45      12 
1/6/2000     50      46      62 
1/7/2000     68      4       55 
1/8/2000     54      64      79 
1/9/2000     26      41      63 
1/10/2000    33      10      18 
1/11/2000    37      82      73 
1/12/2000    67      33      29 
1/13/2000    2       82      17 
1/14/2000    82      74      51 
1/15/2000    9       46      81 
1/16/2000    72      84      70 
1/17/2000    74      77      100 
1/18/2000    19      88      37

Result:

               A      B      C
1/1/2000            
1/2/2000            
1/3/2000            
1/4/2000            
1/5/2000            
1/6/2000            
1/7/2000            
1/8/2000            
1/9/2000            
1/10/2000           
1/11/2000    44      41      58 
1/12/2000    42      43      56 
1/13/2000    44      37      56 
1/14/2000    42      43      49 
1/15/2000    48      48      46 
1/16/2000    43      48      53 
1/17/2000    45      52      54 
1/18/2000    46      59      58

Upvotes: 4

Answers (2)

piRSquared

Reputation: 294258

Warning/Caveat

numpy-based solutions are often (not always) faster, but they are also less intuitive and less flexible. I'm offering this solution to provide useful information to the community. I wouldn't recommend it to someone just getting familiar with pandas and numpy. I suggest you read @Jeff's comments below as well.

`numpy` using `as_strided`

import pandas as pd
import numpy as np
from numpy.lib.stride_tricks import as_strided as stride

v = df.values
n, m = v.shape
s1, s2 = v.strides

# note that `np.nanmean` is used to address potential nan values
pd.DataFrame(
    np.nanmean(stride(v, (n - 9, 10, m), (s1, s1, s2)), 1).round(),
    df.index[9:], df.columns
)

              A     B     C
1/10/2000  44.0  41.0  58.0
1/11/2000  42.0  43.0  56.0
1/12/2000  44.0  37.0  56.0
1/13/2000  42.0  43.0  49.0
1/14/2000  48.0  48.0  46.0
1/15/2000  43.0  48.0  53.0
1/16/2000  45.0  52.0  54.0
1/17/2000  46.0  59.0  58.0
1/18/2000  42.0  62.0  54.0
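
To make the windowing step easier to follow, here is a minimal sketch on a small made-up array (the array `v`, the window size `w = 3`, and the variable names are assumptions for illustration, not part of the original answer). It builds the same kind of overlapping-window view with `as_strided`, then cross-checks it against `sliding_window_view`, a bounds-checked alternative available in NumPy 1.20 and later.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

# Hypothetical toy data: 6 rows (days), 2 columns.
v = np.arange(12, dtype=float).reshape(6, 2)
w = 3  # window length (the answer above uses 10)

n, m = v.shape
s1, s2 = v.strides

# View of shape (n - w + 1, w, m): window i covers rows i .. i + w - 1.
# Reusing the row stride s1 for the first two axes makes windows overlap
# without copying data. as_strided does no bounds checking, so a wrong
# shape or stride reads unrelated memory.
windows = as_strided(v, shape=(n - w + 1, w, m), strides=(s1, s1, s2))
means_strided = np.nanmean(windows, axis=1)

# Safer equivalent: the window axis is appended last, shape (n - w + 1, m, w).
means_safe = np.nanmean(sliding_window_view(v, w, axis=0), axis=-1)

print(means_strided)
print(np.allclose(means_strided, means_safe))
```

Each window mean here belongs to the window's last row, which is why the output above starts at 1/10/2000. To match the question's layout, where 1/11/2000 holds the mean of 1/1 through 1/10, the result would need to be moved down by one row.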

Upvotes: 0

akuiper

Reputation: 214957

You can use rolling.mean() with a shift:

df.rolling(window=10).mean().round().shift()
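
As a rough end-to-end check, the sketch below applies a rolling mean plus a shift to the sample data from the question. The `DataFrame` construction and the `DatetimeIndex` are assumptions for illustration (the question does not show how `df` was built), and `DataFrame.round()` is used for rounding because it leaves the leading `NaN` rows untouched.

```python
import pandas as pd

# Sample data copied from the question; the DatetimeIndex is an assumption.
idx = pd.date_range("2000-01-01", periods=18, freq="D")
df = pd.DataFrame(
    {
        "A": [60, 46, 20, 28, 58, 50, 68, 54, 26, 33, 37, 67, 2, 82, 9, 72, 74, 19],
        "B": [62, 99, 23, 19, 45, 46, 4, 64, 41, 10, 82, 33, 82, 74, 46, 84, 77, 88],
        "C": [88, 28, 94, 79, 12, 62, 55, 79, 63, 18, 73, 29, 17, 51, 81, 70, 100, 37],
    },
    index=idx,
)

# rolling(10).mean() labels each window by its last day; shift() moves the
# value down one row so each day holds the mean of the 10 days before it.
result = df.rolling(window=10).mean().shift().round()

print(result.iloc[9:12])
```

The first ten rows are `NaN` because no full 10-day history exists yet, and the row for 2000-01-11 comes out as 44, 41, 58, matching the result table in the question.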

Upvotes: 4
