Reputation: 514
My original dataframe looks like this :
A B C
27/03/2018 u1 Yes
28/03/2018 u2 Yes
02/04/2018 u2 No
04/04/2018 u1 Yes
05/04/2018 u1 No
10/04/2018 u2 Yes
11/04/2018 u1 No
16/04/2018 u2 Yes
13/05/2018 u2 No
I'd like to get a new column that does this : in each row, it will say how many "Yes" were found for the item in B during the 7 prior days to the actual date. As such, in this example, it would look like this :
A B C D
27/03/2018 u1 Yes 0
28/03/2018 u2 Yes 0
02/04/2018 u2 No 1
04/04/2018 u1 Yes 1
05/04/2018 u1 No 1
10/04/2018 u2 Yes 0
11/04/2018 u1 No 1
16/04/2018 u2 Yes 1
13/05/2018 u2 No 0
I would like to get another column that would do the same but with the prior 30 days. I guess that figuring out how to do it for 7 days would make the rest easier.
P.S : I would also like to do the same but with the actual weeks instead of previous days (as such, for friday 01/06/2018, we would consider the week from 21 to 27 May).
How can this be achieved ?
Upvotes: 0
Views: 179
Reputation: 863791
IIUC need:
def f(x):
d = x['A'] - pd.Timedelta(7, unit='d')
a = (df.A.between(d, x.A) & (df.B == x.B) & (df.C == 'Yes'))
#print (df[a])
return a.sum()
df['new'] = df.apply(f, axis=1)
print (df)
A B C new
0 2018-03-27 u1 Yes 1
1 2018-03-28 u2 Yes 1
2 2018-02-04 u2 No 0
3 2018-04-04 u1 Yes 1
4 2018-05-04 u1 No 0
5 2018-10-04 u2 Yes 1
6 2018-11-04 u1 No 0
7 2018-04-16 u2 Yes 1
8 2018-05-13 u2 No 0
EDIT:
week = df.A.dt.to_period('W')
df['start'] = week.apply(lambda r: r.start_time) - pd.Timedelta(7, unit='d')
df['end'] = week.apply(lambda r: r.end_time) + pd.Timedelta(1, unit='ns') - pd.Timedelta(8, unit='d')
def f(x):
return (df.A.between(x.start, x.end) & (df.B == x.B) & (df.C == 'Yes')).sum()
df['new'] = df.apply(f, axis=1)
print (df)
A B C start end new
0 2018-03-27 u1 Yes 2018-03-19 2018-03-25 0
1 2018-03-28 u2 Yes 2018-03-19 2018-03-25 0
2 2018-02-04 u2 No 2018-01-22 2018-01-28 0
3 2018-04-04 u1 Yes 2018-03-26 2018-04-01 1
4 2018-05-04 u1 No 2018-04-23 2018-04-29 0
5 2018-10-04 u2 Yes 2018-09-24 2018-09-30 0
6 2018-11-04 u1 No 2018-10-22 2018-10-28 0
7 2018-04-16 u2 Yes 2018-04-09 2018-04-15 0
8 2018-05-13 u2 No 2018-04-30 2018-05-06 0
Upvotes: 1