Conditions based in date periods and groups

Question

            A    B           C   D
0  2002-01-13  Dan  2002-01-15  10
1  2002-01-13  Dan  2002-01-25  24
2  2002-01-13  Vic  2002-01-17  14
3  2002-01-13  Vic  2002-01-03  18
4  2002-01-28  Mel  2002-02-08  37
5  2002-01-28  Mel  2002-02-06  29
6  2002-01-28  Mel  2002-02-10  20
7  2002-01-28  Rob  2002-02-12  30
8  2002-01-28  Rob  2002-02-01  47

I want to create a new df['E'] column, with next conditions, per each B group:

E=D value, where A dates are nearest to 10 days later than C date.
If there are two C dates at the same distance to 10 days from A (case of 2002-01-28 Mel), E will be the mean of these same-period D values.

Output should be:

            A    B           C   D   E
0  2002-01-13  Dan  2002-01-15  10  24
1  2002-01-13  Dan  2002-01-25  24  24
2  2002-01-13  Vic  2002-01-17  14  14
3  2002-01-13  Vic  2002-01-03  18  14
4  2002-01-28  Mel  2002-02-08  37  33
5  2002-01-28  Mel  2002-02-06  29  33
6  2002-01-28  Mel  2002-02-10  20  33
7  2002-01-28  Rob  2002-02-12  30  30 
8  2002-01-28  Rob  2002-02-01  47  30

BENY · Accepted Answer

Ok , seems you need

df['E']=abs((df.C-df.A).dt.days-10)# get the days different 
df['E']=df.B.map(df.loc[df.E==df.groupby('B').E.transform('min')].groupby('B').D.mean())# find the min value for the different , and get the mean 
df
Out[106]: 
           A    B          C   D   E
0 2002-01-13  Dan 2002-01-15  10  24
1 2002-01-13  Dan 2002-01-25  24  24
2 2002-01-13  Vic 2002-01-17  14  14
3 2002-01-13  Vic 2002-01-03  18  14
4 2002-01-28  Mel 2002-02-08  37  33
5 2002-01-28  Mel 2002-02-06  29  33
6 2002-01-28  Mel 2002-02-10  20  33
7 2002-01-28  Rob 2002-02-12  30  30
8 2002-01-28  Rob 2002-02-01  47  30

Conditions based in date periods and groups

Answers (1)

Related Questions