Pandas: aggregating by different columns with MultiIndex columns

Question

I would like to take a DataFrame with MultiIndex columns (and a DatetimeIndex as the row index) and then aggregate it using a different function for each column.

For example, consider the following table, where the index holds dates, the first level of the columns is price and volume, and the second level is the ticker (e.g. AAPL and AMZN).

import numpy as np
import pandas as pd

# One row per day of 2017 (365 days) for each ticker, with random values
df1 = pd.DataFrame({"ticker": ["AAPL"]*365,
                'date': pd.date_range(start='20170101', end='20171231'),
                'volume': [np.random.randint(50,100) for i in range(365)],
                'price': [np.random.randint(100,200) for i in range(365)]})
df2 = pd.DataFrame({"ticker": ["AMZN"]*365,
                'date': pd.date_range(start='20170101', end='20171231'),
                'volume': [np.random.randint(50,100) for i in range(365)],
                'price': [np.random.randint(100,200) for i in range(365)]})
df = pd.concat([df1,df2])

# Dates become the index; columns become a (field, ticker) MultiIndex
grp = df.groupby(['date', 'ticker']).mean().unstack()
grp.head()

I would like to aggregate the data by month, taking the mean of price and the sum of volume.

I would have thought that something along the lines of grp.resample("MS").agg({"price":"mean", "volume":"sum"}) should work, but it does not, because the columns are a MultiIndex. What's the best way to accomplish this?
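For illustration, here is a minimal sketch of one possible workaround, not necessarily the best one: resample each top-level column group separately and join the pieces back together with pd.concat. The seeded stand-in for grp and the names funcs and monthly are assumptions made up for this example; only the price/volume and AAPL/AMZN labels come from the question.

```python
import numpy as np
import pandas as pd

# Stand-in for `grp`: daily rows, (field, ticker) MultiIndex columns.
# The random data and the seed are assumptions for this example only.
dates = pd.date_range(start="2017-01-01", end="2017-12-31")
columns = pd.MultiIndex.from_product([["price", "volume"], ["AAPL", "AMZN"]])
rng = np.random.default_rng(0)
grp = pd.DataFrame(
    rng.integers(50, 200, size=(len(dates), len(columns))),
    index=dates,
    columns=columns,
)

# One aggregation function per top-level column label.
funcs = {"price": "mean", "volume": "sum"}

# Selecting a top-level label yields a plain frame with one column per
# ticker, which resamples without trouble. Passing a dict to pd.concat
# with axis=1 restores the top level from the dict keys.
monthly = pd.concat(
    {name: grp[name].resample("MS").agg(func) for name, func in funcs.items()},
    axis=1,
)

print(monthly.head())
```

The result keeps the same two-level column layout as grp, with one row per month start. Adding another field would only require another entry in funcs.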
