GNMO11

Reputation: 2259

Pandas compute hourly average

I have a dataframe where time is a float relevant to the dataset:

 Time            Value
-47.88333         90
-46.883333        23
-45.900000        66
-45.883333        87
-45.383333        43

The time column ranges from -48 to 0. What I would like to do is compute the average value for each one-hour bin, labeled by its midpoint, from -47.5 through -0.5. For example:

-47.5 would be the average of all values that fell between -48 and -47, and -46.5 would be the average of all values that fell between -47 and -46. If a bin has no values, I would like to carry forward the previous average.

Resulting in an output that would look like:

 Time            Value
-47.5             90
-46.5             23
-45.5             65.33
-44.5             65.33
-43.5             65.33

Will this need to be a custom function since the time column is not a datetime object?

Upvotes: 3

Views: 1736

Answers (2)

Randy

Reputation: 14857

You can do this with a groupby pretty easily:

(df.groupby(df.Time.apply(lambda x: np.floor(x) + 0.5))
   .mean()
   .Value
   .reindex(np.arange(-47.5, -42.5))
   .ffill())

Time
-47.5    90.000000
-46.5    23.000000
-45.5    65.333333
-44.5    65.333333
-43.5    65.333333
Name: Value, dtype: float64
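
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby idea. It assumes the five sample rows from the question, and it swaps the apply/lambda for a vectorized np.floor call on the whole column, which should give the same bin labels. The names midpoint, target and hourly are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Time": [-47.88333, -46.883333, -45.9, -45.883333, -45.383333],
    "Value": [90, 23, 66, 87, 43],
})

# Label each row with the midpoint of its one-unit bin, e.g. -47.88 -> -47.5.
# Using floor means each bin is closed on the left: [-48, -47), [-47, -46), ...
midpoint = (np.floor(df["Time"]) + 0.5).rename("Time")

# Bins to report; shortened to -43.5 here to match the output shown above.
# np.arange(-47.5, 0) would cover the whole -47.5 .. -0.5 range.
target = np.arange(-47.5, -42.5)

hourly = (
    df.groupby(midpoint)["Value"]
      .mean()            # average per bin
      .reindex(target)   # add the empty bins as NaN
      .ffill()           # carry the previous average forward
)
print(hourly)
```

Note that a bin with no data before the first observed bin would stay NaN, since ffill has nothing to carry forward there.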

Upvotes: 3

Sam

Reputation: 4090

Try binning the time variables with pd.cut:

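# Change the bins argument to modify the size of the bins.
# range(-48, 1) includes the final edge at 0, so the (-1, 0] bin is not dropped.
df['TimeBin'] = pd.cut(df.Time, bins=list(range(-48, 1)))
# Group by the time bin and take the mean (empty bins come back as NaN):
df.groupby('TimeBin', observed=False)['Value'].mean()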
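
To get from pd.cut to the exact output asked for in the question (midpoint labels and forward-filled gaps), something along these lines might work. This is a sketch under a few assumptions: the sample rows from the question, bins closed on the left via right=False, and illustrative names edges, labels, time_bin and binned.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Time": [-47.88333, -46.883333, -45.9, -45.883333, -45.383333],
    "Value": [90, 23, 66, 87, 43],
})

edges = np.arange(-48, 1)     # bin edges: -48, -47, ..., 0
labels = edges[:-1] + 0.5     # bin midpoints: -47.5, -46.5, ..., -0.5

# right=False makes each bin [a, a + 1); a Time of exactly 0 would fall
# outside the last bin and be labeled NaN.
time_bin = pd.cut(df["Time"], bins=edges, labels=labels, right=False)

binned = (
    df.groupby(time_bin, observed=False)["Value"]  # keep empty bins
      .mean()
      .ffill()                                     # carry previous average forward
)
print(binned.head())
```

Passing observed=False keeps every bin in the result even when no rows fall into it, which is what lets ffill fill the gaps.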

Upvotes: 2
