Reputation: 383
I want to replace column value of dataframe with mean(without zeros) value of column grouped by another column.
Dataframe df is like:
ID | TYPE | rate
-------------
1 | A | 0 <- Replace this
2 | B | 2
3 | C | 1
4 | A | 2
5 | C | 1
6 | C | 0 <- Replace this
7 | C | 8
8 | C | 2
9 | D | 0 <- Replace this
I have to replace values in rating where rating = 0:
df['rate'][df['rate']==0] = ?
with average value for that TYPE.
Average(without zeros) value for every type is:
A = 2/1 = 2
B = 2/1 = 2
C = (1 + 1 + 8 + 2)/4 = 3
D = 0 (default value when there isn't information for type)
Expected result:
ID | TYPE | rate
-------------
1 | A | 2 <- Changed
2 | B | 2
3 | C | 1
4 | A | 2
5 | C | 1
6 | C | 3 <- Changed
7 | C | 8
8 | C | 2
9 | D | 0 <- Changed
Upvotes: 3
Views: 172
Reputation: 88236
You could mask
the rate
column in the dataframe, GroupBy
the TYPE
and transform
with the mean
, which will exlude NaNs
. The use fillna
to replace the values in the masked dataframe:
ma = df.rate.mask(df.rate.eq(0))
df['rate'] = ma.fillna(ma.groupby(df.TYPE).transform('mean').fillna(0))
ID TYPE rate
0 1 A 2.0
1 2 B 2.0
2 3 C 1.0
3 4 A 2.0
4 5 C 1.0
5 6 C 3.0
6 7 C 8.0
7 8 C 2.0
8 9 D 0.0
Upvotes: 3