Reputation: 17
I want to be able to perform a rolling average on the Data column based on values which are labeled as True in a separate column. Below is an example data set.
Data Condition
2020-01-01 0:00 2.0 False
2020-01-01 0:01 2.0 False
2020-01-01 0:02 732.1 True
2020-01-01 0:03 734.2 True
2020-01-01 0:04 733.3 True
2020-01-01 0:05 731.2 True
2020-01-01 0:06 2.0 False
2020-01-01 0:07 2.0 False
...
2020-06-15 23:52 2.0 False
2020-06-15 23:53 732.1 True
2020-06-15 23:54 734.2 True
2020-06-15 23:55 733.3 True
2020-06-15 23:56 731.2 True
2020-06-15 23:57 2.0 False
2020-06-15 23:58 2.0 False
2020-06-15 23:59 2.0 False
My expected output, with a window size of 3, for the data column, would be the following:
[2.0, 2.0, 733.15, 733.2, 732.9, 732.25, 2, 2, ..., 2.0, 733.15, 733.2, 732.9, 732.25, 2.0, 2.0, 2.0].
So essentially, for each row, it would take the average of the row above and below that row, as well as the row in question, but always excluding the numbers when it is false.
Here is some code I found, except for that it is not for the current pandas.
pandas.rolling_mean(df[df.Condition], window=30)
I found it here: Conditional mean by rolling
Any help or information on this would be greatly appreciated!
Upvotes: 0
Views: 381
Reputation: 8768
I believe you would filter your dataframe to only get the rows that fit your condition, then apply a rolling mean.
df['rolling'] = df.loc[df['Condition'] == True]['Data'].rolling(30).mean().fillna(0)
Upvotes: 1