adrian leverkuhn
adrian leverkuhn

Reputation: 35

Pandas: aggregating by different columns with MultiIndex columns

I would like to take a dataframe with MultiIndex columns (where the index is a DatetimeIndex), and the aggregate by different functions depending on the column.

For example, consider the following table where index includes dates, first level of columns are Price and Volume, and second level of columns are tickers (e.g. AAPL and AMZN).

df1 = pd.DataFrame({"ticker":["AAPL"]*365, 
                'date': pd.date_range(start='20170101', end='20171231'), 
                'volume' : [np.random.randint(50,100) for i in range(365)],
                'price': [np.random.randint(100,200) for i in range(365)]}) 
df2 = pd.DataFrame({"ticker":["AMZN"]*365, 
                'date': pd.date_range(start='20170101', end='20171231'), 
                'volume' : [np.random.randint(50,100) for i in range(365)], 
                'price': [np.random.randint(100,200) for i in range(365)]})
df = pd.concat([df1,df2])

grp = df.groupby(['date', 'ticker']).mean().unstack()
grp.head()

enter image description here

What I would like to do is to aggregate the data by month, but taking the mean of price and sum of volume.

I would have thought that something along the lines of grp.resample("MS").agg({"price":"mean", "volume":"sum"}) should work, but it does not because of the multi-index column. What's the best way to accomplish this?

Upvotes: 1

Views: 668

Answers (1)

BENY
BENY

Reputation: 323246

You can

df.groupby([pd.to_datetime(df.date).dt.strftime('%Y-%m'),df.ticker]).\
    agg({"price":"mean", "volume":"sum"}).unstack()
Out[529]: 
              price             volume      
ticker         AAPL        AMZN   AAPL  AMZN
date                                        
2017-01  155.548387  141.580645   2334  2418
2017-02  154.035714  156.821429   2112  2058
2017-03  154.709677  148.806452   2258  2188
2017-04  154.366667  149.366667   2271  2254
2017-05  154.774194  155.096774   2331  2264
2017-06  147.333333  145.133333   2220  2302
2017-07  149.709677  150.645161   2188  2412
2017-08  150.806452  154.645161   2265  2341
2017-09  157.033333  151.466667   2199  2232
2017-10  149.387097  145.580645   2303  2203
2017-11  154.100000  150.266667   2212  2275
2017-12  156.064516  149.290323   2265  2224

Upvotes: 2

Related Questions