Reputation: 2259
I have a dataframe where time is a float relevant to the dataset:
Time Value
-47.88333 90
-46.883333 23
-45.900000 66
-45.883333 87
-45.383333 43
The time column ranges from -48 to 0. What I would like to do is compute the average value for each one-hour bin, labeled by its half-hour midpoint, from -47.5 through -0.5. For example, -47.5 would be the average of all values that fell between -48 and -47, and -46.5 would be the average of all values that fell between -47 and -46. If a bin has no values, I would like to carry forward the previous average.
Resulting in an output that would look like:
Time Value
-47.5 90
-46.5 23
-45.5 65.33
-44.5 65.33
-43.5 65.33
Will this need to be a custom function since the time column is not a datetime object?
Upvotes: 3
Views: 1736
Reputation: 14857
You can do this with a groupby pretty easily:
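import numpy as np

# Label each row with the midpoint of its bin, average per bin,
# then forward-fill the bins that have no data.
(df.groupby(np.floor(df.Time) + 0.5)
   .mean()
   .Value
   .reindex(np.arange(-47.5, -42.5))
   .ffill())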
Time
-47.5 90.000000
-46.5 23.000000
-45.5 65.333333
-44.5 65.333333
-43.5 65.333333
Name: Value, dtype: float64
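For reference, here is a self-contained sketch of the same idea run over the full range asked about in the question (-47.5 through -0.5). The sample frame is rebuilt from the question's data; the names `midpoints` and `result` are just illustrative. It assumes each bin is closed on the left, so a time of exactly -47 lands in the -46.5 bin, and a time of exactly 0 would fall outside the reindexed range.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": [-47.88333, -46.883333, -45.900000, -45.883333, -45.383333],
    "Value": [90, 23, 66, 87, 43],
})

# Map each time to the midpoint of its one-unit bin, e.g. -47.88 -> -47.5.
midpoints = np.floor(df["Time"]) + 0.5

result = (
    df.groupby(midpoints)["Value"]
      .mean()
      .reindex(np.arange(-47.5, 0))  # every midpoint from -47.5 through -0.5
      .ffill()                       # carry the previous average into empty bins
)
print(result.head())
```

Selecting `["Value"]` before `.mean()` avoids averaging the Time column as well, which is a small change from the snippet above.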
Upvotes: 3
Reputation: 4090
Try binning the time variable with pd.cut:
# Change the bins argument to modify the size of the bins.
# The edges run from -48 through 0 so that the last bin, (-1, 0], is included.
df['TimeBin'] = pd.cut(df.Time, bins=list(range(-48, 1)))
# Group by the time bin and take the mean:
df[['TimeBin', 'Value']].groupby('TimeBin').mean()
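As a rough sketch of how the pd.cut approach could also produce the midpoint labels and the carry-forward behaviour asked for in the question: the `labels`, `right=False`, and `observed=False` choices below are my assumptions about the desired bin boundaries, not something stated in the question, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": [-47.88333, -46.883333, -45.900000, -45.883333, -45.383333],
    "Value": [90, 23, 66, 87, 43],
})

edges = np.arange(-48, 1)   # bin edges -48, -47, ..., 0
labels = edges[:-1] + 0.5   # midpoints -47.5, -46.5, ..., -0.5

# right=False makes each bin [left, right), so -47.0 itself falls in the -46.5 bin.
time_bin = pd.cut(df["Time"], bins=edges, labels=labels, right=False)

binned = (
    df.groupby(time_bin, observed=False)["Value"]  # keep empty bins in the result
      .mean()
      .ffill()                                     # carry the previous average forward
)
print(binned.head())
```

With `observed=False` the empty bins show up as NaN before the forward fill, so no separate reindex step is needed.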
Upvotes: 2