Chris
Chris

Reputation: 433

Computing historical average for panel data

Imagine that you have the following dataframe: Now, what we want to compute is the INDUSTRY average historical ROE, and place this number in a new column.

Date        ROE   id 
30-06-2013  2%    1
30-06-2013  1%    2        
30-09-2013  5%    1
30-09-2013  6%    2
30-09-2013  4%    3        
31-12-2013  10%   1
31-12-2013  5%    2
31-12-2013  3%    3    
31-03-2014  11%   1
31-03-2014  6%    2
31-03-2014  4%    3

When computing the historical ROE we want to rely on a minimum of 4 different dates. The resulting dataframe should look the following:

Date        ROE   id Hist. avg. ROE 
30-06-2013  2%    1  NaN
30-06-2013  1%    2  NaN      
30-09-2013  5%    1  NaN
30-09-2013  6%    2  NaN
30-09-2013  4%    3  NaN      
31-12-2013  10%   1  NaN
31-12-2013  5%    2  NaN
31-12-2013  3%    3  NaN
31-03-2014  11%   1  5.2%
31-03-2014  6%    2  5.2%
31-03-2014  4%    3  5.2%

Upvotes: 0

Views: 285

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150765

IIUC, you want rolling().mean():

df['hist'] = (df.groupby('id', as_index=False)
                .ROE.rolling(4).mean()
                .reset_index(level=0, drop=True)
             )

Output:

          Date   ROE  id   hist
0   30-06-2013  0.02   1    NaN
1   30-06-2013  0.01   2    NaN
2   30-09-2013  0.05   1    NaN
3   30-09-2013  0.06   2    NaN
4   30-09-2013  0.04   3    NaN
5   31-12-2013  0.10   1    NaN
6   31-12-2013  0.05   2    NaN
7   31-12-2013  0.03   3    NaN
8   31-03-2014  0.11   1  0.070
9   31-03-2014  0.06   2  0.045
10  31-03-2014  0.04   3    NaN

Update: Per comment, you may want to aggregate on date and then rolling on the result.

df['Date'] = pd.to_datetime(df['Date'])

new_df = (df.groupby('Date').ROE.agg(['sum','count'])
            .rolling(4).sum()
            .assign(hist=lambda x: x['sum']/x['count'])
         )

df['hist'] = df['Date'].map(new_df['hist'])

Output:

         Date   ROE  id      hist
0  2013-06-30  0.02   1       NaN
1  2013-06-30  0.01   2       NaN
2  2013-09-30  0.05   1       NaN
3  2013-09-30  0.06   2       NaN
4  2013-09-30  0.04   3       NaN
5  2013-12-31  0.10   1       NaN
6  2013-12-31  0.05   2       NaN
7  2013-12-31  0.03   3       NaN
8  2014-03-31  0.11   1  0.051818
9  2014-03-31  0.06   2  0.051818
10 2014-03-31  0.04   3  0.051818

Upvotes: 1

Related Questions