Reputation: 1358
I've created a convenience method to perform resampling on an arbitrary dataframe:
def resample_data_to_hourly(df):
    df = df.resample('1H', how='mean', fill_method='ffill',
                     closed='left', label='left')
    return df
And I would like to apply this function to every dataframe in a groupby object with something like the following:
df.transform(resample_data_to_hourly)
df.aggregate(resample_data_to_hourly)
df.apply(resample_data_to_hourly)
I've tried them all with no success. Whatever I do, the dataframe is unchanged, even when I assign the result of the calls above to a new dataframe (which, to my understanding, I shouldn't have to do).
I'm sure there is something straightforward and idiomatic about handling groupby objects with time series data that I am missing here, but I haven't been able to correct my program.
How do I write functions like the one above so that they apply properly to a groupby object? My code works if I iterate through the groups as I would a dictionary and collect the results in a new dictionary, which I then convert back into a groupby object, but this is terribly hacky, and I feel I'm missing out on much of what Pandas can do.
EDIT ADDING BASE EXAMPLE:
import pandas as pd
from numpy.random import randn
rng = pd.date_range('1/1/2000', periods=10, freq='10m')
df = pd.DataFrame({'a': pd.Series(randn(len(rng)), index=rng),
                   'b': pd.Series(randn(len(rng)), index=rng)})
yields:
                    a          b
2000-01-31   0.168622   0.539533
2000-11-30  -0.283783   0.687311
2001-09-30  -0.266917  -1.511838
2002-07-31  -0.759782  -0.447325
2003-05-31  -0.110677   0.061783
2004-03-31   0.217771   1.785207
2005-01-31   0.450280   1.759651
2005-11-30   0.070834   0.184432
2006-09-30   0.254020  -0.895782
2007-07-31  -0.211647  -0.072757
df.groupby('a').transform(resample_data_to_hourly)  # should yield resampled data with both columns a and b,
# but instead yields only column b
# df.apply yields both columns, but makes no changes to the actual data
# (with this sample no change is expected, but data could be generated for which a change should occur)
# If someone could supply a reliable way to generate data that can be resampled, that would be wonderful.
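On the last point, one possible way to generate data that visibly changes when resampled is sketched below. It assumes the intent was 10-minute spacing ('10min') rather than the 10-month spacing that '10m' produced above. The seed, the 18-row length, and the name hourly are illustrative choices only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 18 rows at 10-minute spacing, i.e. 3 full hours.
rng = pd.date_range("2000-01-01", periods=18, freq="10min")
values = np.random.default_rng(0).standard_normal((len(rng), 2))
df = pd.DataFrame(values, index=rng, columns=["a", "b"])

# Downsample to hourly means; each hourly bin averages six 10-minute rows,
# so the result has fewer rows than the input.
hourly = df.resample("60min", closed="left", label="left").mean()
print(hourly)
```

Because the sampling interval is shorter than the target interval, the resampled frame differs from the input, which makes it easy to tell whether a function was actually applied.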
Upvotes: 1
Views: 1395
Reputation: 1834
(data.groupby(level=0)
     .apply(lambda d: d.reset_index(level=0, drop=True)
                       .resample("M", how="")))
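For anyone on a recent pandas release, here is a minimal sketch of the same idea. It assumes the data has a two-level index with the group key first and the timestamps second. The how= and fill_method= keywords were deprecated in pandas 0.18 and are no longer accepted by recent releases, so the aggregation is written as method chaining instead. The level names "key" and "time" and the sample values are made up for illustration, and "60min" stands in for the question's '1H'.

```python
import numpy as np
import pandas as pd

def resample_data_to_hourly(df):
    # Method-chaining equivalent of how='mean', fill_method='ffill'.
    return df.resample("60min", closed="left", label="left").mean().ffill()

# Hypothetical data: two groups, each with 10-minute samples over 2 hours.
times = pd.date_range("2000-01-01", periods=12, freq="10min")
index = pd.MultiIndex.from_product([["x", "y"], times], names=["key", "time"])
data = pd.DataFrame({"a": np.arange(24.0), "b": np.arange(24.0) * 2}, index=index)

# Drop the group level inside each group so that resample sees a plain
# DatetimeIndex; apply then prepends the group key to the combined result.
result = data.groupby(level="key", group_keys=True).apply(
    lambda d: resample_data_to_hourly(d.reset_index(level="key", drop=True))
)
print(result)
```

Note that apply returns a new object and does not modify data in place, so the result has to be assigned to a name.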
Upvotes: 3