Compute daily mean with pandas

Question

I have a function that will calculate the means of values in column "A" of a pandas dataframe that fall on the same date. Is there an approach to do this that is more efficient than the list comprehension in the last line?

import pandas as pd
import numpy as np

def daily_mean(df,date,col):
     return df[date][col].mean()

data = np.random.rand(100)
columns = ['A']
times = pd.date_range('20180101',freq='9H',periods=100)
df = pd.DataFrame(data=data,index=times,columns=columns)

dates = df.index.strftime('%Y%m%d').unique()
means = [daily_mean(df,date,'A') for date in dates]

cs95 · Accepted Answer

Yes, you can use the pd.Grouper API. Also, this will retain the DataFrame, so it's a win-win.

means = df.groupby(pd.Grouper(freq='1D')).mean()

Or, you can group on the floor of the datetime index.

means = df.groupby(df.index.floor('D')).mean()

means.head(10)
                   A
2018-01-01  0.412354
2018-01-02  0.301209
2018-01-03  0.710357
2018-01-04  0.158135
2018-01-05  0.720308
2018-01-06  0.281862
2018-01-07  0.489486
2018-01-08  0.287808
2018-01-09  0.463117
2018-01-10  0.512963

I prefer approach #1 since pd.Grouper should be more efficient than grouping on an unrelated series/array (as in the second form).

Compute daily mean with pandas

Answers (2)

Related Questions