Reputation: 2263
I have a dataframe with dates, id's and values.
For example:
date id value
2016-08-28 A 1
2016-08-28 B 1
2016-08-29 C 2
2016-09-02 B 0
2016-09-03 A 3
2016-09-06 C 1
2017-01-15 B 2
2017-01-18 C 3
2017-01-18 A 2
I want to apply a rolling mean by element, stating one after, so that the result would be like:
date id value rolling_mean
2016-08-28 A 1 NaN
2016-08-28 B 1 NaN
2016-08-29 C 2 NaN
2016-09-02 B 0 0.5
2016-09-03 A 3 2.0
2016-09-06 C 1 1.5
2017-01-15 B 2 1.0
2017-01-18 C 3 2.0
2017-01-18 A 2 2.5
The closest I've come to this was:
grouped = df.groupby(["id", "value"])
df["rolling_mean"] = grouped["value"].shift(1).rolling(window = 2).mean()
But this gives me the wrong values back, as it keeps the order with the remaining elements.
Any ideia?
Thank you in advance,
Upvotes: 1
Views: 1489
Reputation: 34046
Like this:
df['rolling_mean'] = df.groupby('id')['value'].rolling(2).mean().reset_index(0,drop=True).sort_index()
Output:
date id value rolling_mean
0 2016-08-28 A 1 nan
1 2016-08-28 B 1 nan
2 2016-08-29 C 2 nan
3 2016-09-02 B 0 0.50
4 2016-09-03 A 3 2.00
5 2016-09-06 C 1 1.50
6 2017-01-15 B 2 1.00
7 2017-01-18 C 3 2.00
8 2017-01-18 A 2 2.50
Upvotes: 1
Reputation: 150735
You can just groupby id
and use transform
:
df['rolling_mean'] = df.groupby('id')['value'].transform(lambda x: x.rolling(2).mean())
Output:
date id value rolling_mean
0 2016-08-28 A 1 NaN
1 2016-08-28 B 1 NaN
2 2016-08-29 C 2 NaN
3 2016-09-02 B 0 0.5
4 2016-09-03 A 3 2.0
5 2016-09-06 C 1 1.5
6 2017-01-15 B 2 1.0
7 2017-01-18 C 3 2.0
8 2017-01-18 A 2 2.5
Upvotes: 6
Reputation: 323226
Fix your code with groupby
with id
grouped = df.groupby(["id"])
df['rolling_mean']=grouped["value"].rolling(window = 2).mean().reset_index(level=0,drop=True)
df
Out[67]:
date id value rolling_mean
0 2016-08-28 A 1 NaN
1 2016-08-28 B 1 NaN
2 2016-08-29 C 2 NaN
3 2016-09-02 B 0 0.5
4 2016-09-03 A 3 2.0
5 2016-09-06 C 1 1.5
6 2017-01-15 B 2 1.0
7 2017-01-18 C 3 2.0
8 2017-01-18 A 2 2.5
Upvotes: 1