Reputation: 649
I have a df presented below:
Added code for dataframe:
df = pd.DataFrame({'Name':['John']*12
,'Type':[*'AABBBBAAABAB']
,'Period':['Noon','Morning']*6
,'Date':['1/1/2020']*2+['1/4/2020']*2+
['1/5/2020']*2+['1/2/2020']*2+
['1/3/2020']*2+['1/19/2020', '1/31/2020']
,'value':[1,2,7,8,9,10,3,4,5,6,5,6]}, index=range(1,13))
Basically I need 2 consecutive days moving average on a person's (John as an example) value, classified by type and period (person too, but not in this table).
Expected result as shown last column.
eg.1. row 4, John's B type in Morning, average of value on 1/4/2020 and 1/3/2020 (row10) is (6+8)/2=7
,
and no data prior to 1/3/2020 of John's type B, so row10 result is na
eg.2. row 11 and 12, they are not consecutive date, so both are na
How do I use pandas to realise this function? should I use transform
or rolling
?
Upvotes: 1
Views: 230
Reputation: 153460
You can use the following:
df.set_index('Date')\
.groupby(['Name','Type','Period'])['value']\
.rolling('2D', min_periods=2)\
.mean()\
.reset_index()
Output:
Name Type Period Date value
0 John A Morning 2020-01-01 NaN
1 John A Morning 2020-01-02 3.0
2 John A Noon 2020-01-01 NaN
3 John A Noon 2020-01-02 2.0
4 John A Noon 2020-01-03 4.0
5 John A Noon 2020-01-19 NaN
6 John B Morning 2020-01-04 NaN
7 John B Morning 2020-01-05 9.0
8 John B Morning 2020-01-03 8.0
9 John B Morning 2020-01-31 NaN
10 John B Noon 2020-01-04 NaN
11 John B Noon 2020-01-05 8.0
And, one way to merge it back to your original dataframe:
df_er = (df.set_index('Date')
.groupby(['Name','Type','Period'])['value']
.rolling('2D', min_periods=2)
.mean()
.rename('Expected Result'))
df.merge(df_er, left_on=['Name', 'Type', 'Period', 'Date'], right_index=True)
Output:
Name Type Period Date value Expected Result
1 John A Noon 2020-01-01 1 NaN
2 John A Morning 2020-01-01 2 NaN
3 John B Noon 2020-01-04 7 NaN
4 John B Morning 2020-01-04 8 NaN
5 John B Noon 2020-01-05 9 8.0
6 John B Morning 2020-01-05 10 9.0
7 John A Noon 2020-01-02 3 2.0
8 John A Morning 2020-01-02 4 3.0
9 John A Noon 2020-01-03 5 4.0
10 John B Morning 2020-01-03 6 8.0
11 John A Noon 2020-01-19 5 NaN
12 John B Morning 2020-01-31 6 NaN
Upvotes: 2