konrac
konrac

Reputation: 9

Calculate grouped average of columns in pandas dataframe for non zero elements based on time

Hello I would like to calculate the average of days column for each Name group at each Time. The average at each Time for each group should be calculated only based on the elements of the rows of the Days which are larger than 0. Any feedback would be highly appreciated ..

Name       Time        Days    Average
 
John    2021-12-02      0        0
John    2021-12-03      2        0 
John    2021-12-05      9        2
John    2021-12-07      0       5.5
John    2021-12-10     10       5.5
Larry   2021-12-02     20        0
Jim     2021-12-09     20        0
Jim     2021-12-10     20       20
Jim     2021-12-12     40       20
Jim     2021-12-12      0       26.6
Juli    2021-11-09      0        0
Juli    2021-11-10      0        0
Juli    2021-11-12     40        0
Juli    2021-11-18      0       40
Juli    2021-11-12      0       40
Juli    2021-11-18      2       40
Juli    2021-11-19      0       21

Upvotes: 1

Views: 74

Answers (1)

jezrael
jezrael

Reputation: 862481

First replace 0 to missing values and then use GroupBy.transform with lambda function for Series.expanding with mean and Series.shift, last replace NaNs to 0 by Series.fillna:

df['Avg'] = (df.assign(Days = df['Days'].replace(0,np.nan))
               .groupby('Name')['Days']
               .transform(lambda x: x.expanding().mean().shift())
               .fillna(0))
print (df)
     Name        Time  Days  Average        Avg
0    John  2021-12-02     0      0.0   0.000000
1    John  2021-12-03     2      0.0   0.000000
2    John  2021-12-05     9      2.0   2.000000
3    John  2021-12-07     0      5.5   5.500000
4    John  2021-12-10    10      5.5   5.500000
5   Larry  2021-12-02    20      0.0   0.000000
6     Jim  2021-12-09    20      0.0   0.000000
7     Jim  2021-12-10    20     20.0  20.000000
8     Jim  2021-12-12    40     20.0  20.000000
9     Jim  2021-12-12     0     26.6  26.666667
10   Juli  2021-11-09     0      0.0   0.000000
11   Juli  2021-11-10     0      0.0   0.000000
12   Juli  2021-11-12    40      0.0   0.000000
13   Juli  2021-11-18     0     40.0  40.000000
14   Juli  2021-11-12     0     40.0  40.000000
15   Juli  2021-11-18     2     40.0  40.000000
16   Juli  2021-11-19     0     21.0  21.000000

Upvotes: 1

Related Questions