Reputation: 647
A B C D
0 2002-01-13 Dan 2002-01-15 10
1 2002-01-13 Dan 2002-01-25 24
2 2002-01-13 Vic 2002-01-17 14
3 2002-01-13 Vic 2002-01-03 18
4 2002-01-28 Mel 2002-02-08 37
5 2002-01-28 Mel 2002-02-06 29
6 2002-01-28 Mel 2002-02-10 20
7 2002-01-28 Rob 2002-02-12 30
8 2002-01-28 Rob 2002-02-01 47
I want to create a new df['E']
column, with next conditions, per each B
group:
A
dates are nearest to 10 days later than C
date.C
dates at the same distance to 10 days from A
(case of 2002-01-28
Mel
), E
will be the mean of these same-period D
values.Output should be:
A B C D E
0 2002-01-13 Dan 2002-01-15 10 24
1 2002-01-13 Dan 2002-01-25 24 24
2 2002-01-13 Vic 2002-01-17 14 14
3 2002-01-13 Vic 2002-01-03 18 14
4 2002-01-28 Mel 2002-02-08 37 33
5 2002-01-28 Mel 2002-02-06 29 33
6 2002-01-28 Mel 2002-02-10 20 33
7 2002-01-28 Rob 2002-02-12 30 30
8 2002-01-28 Rob 2002-02-01 47 30
Upvotes: 0
Views: 23
Reputation: 323276
Ok , seems you need
df['E']=abs((df.C-df.A).dt.days-10)# get the days different
df['E']=df.B.map(df.loc[df.E==df.groupby('B').E.transform('min')].groupby('B').D.mean())# find the min value for the different , and get the mean
df
Out[106]:
A B C D E
0 2002-01-13 Dan 2002-01-15 10 24
1 2002-01-13 Dan 2002-01-25 24 24
2 2002-01-13 Vic 2002-01-17 14 14
3 2002-01-13 Vic 2002-01-03 18 14
4 2002-01-28 Mel 2002-02-08 37 33
5 2002-01-28 Mel 2002-02-06 29 33
6 2002-01-28 Mel 2002-02-10 20 33
7 2002-01-28 Rob 2002-02-12 30 30
8 2002-01-28 Rob 2002-02-01 47 30
Upvotes: 2