pandas: replacing null values with average by group

Question

I am trying to replace null values in a column with the average of the group defined by another column. I tried the code below; the null values are replaced, but not with the correct values. Why is that, and how should I correct it?

The first two null values should be replaced by 3, since they belong to group 'A', where the average is 3. The next null value should also be 3, since it is in group 'B', whose non-null values 4, 2, 1, 5 average 3.

After sorting by 'x', column 'z' should have the following values: 3 3 3 5 3 1 2 4 6 9 10 5

import pandas as pd

xx = float('nan')
data = [['A', 1, xx],
        ['B', 5, 5],
        ['C', 4, 6],
        ['A', 6, xx],
        ['B', 9, xx],
        ['C', 7, 9],
        ['A', 2, 3],
        ['B', 5, 1],
        ['C', 2, 10],
        ['B', 8, 2],
        ['B', 5, 4],
        ['C', 8, 5]]
dff = pd.DataFrame(data, columns=['x', 'y', 'z'])

dff = dff.sort_values(by=['x'], ascending=True)
dff.reset_index(drop=True, inplace=True)
print(dff)

dff['z'] = df.groupby(['x'])['z'].transform(lambda x: x.fillna(x.mean()))
print(dff)

Answers (1)
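One likely cause, judging only from the code in the question: the fill is computed from `df.groupby(...)`, while the frame being sorted, re-indexed and assigned to is `dff`. If `df` is some other DataFrame left over in the session (an assumption, since it is not defined in the snippet), its values are aligned to `dff` by index label and land on the wrong rows. The sketch below groups `dff` itself. It also sorts with `kind="mergesort"`, which is a stable sort, so the rows inside each group keep their original order; that choice and the `group_means` name are mine, not part of the question.

```python
import pandas as pd

nan = float("nan")
data = [
    ["A", 1, nan], ["B", 5, 5], ["C", 4, 6],
    ["A", 6, nan], ["B", 9, nan], ["C", 7, 9],
    ["A", 2, 3], ["B", 5, 1], ["C", 2, 10],
    ["B", 8, 2], ["B", 5, 4], ["C", 8, 5],
]
dff = pd.DataFrame(data, columns=["x", "y", "z"])

# Stable sort so rows keep their original relative order within each group.
dff = dff.sort_values(by="x", kind="mergesort").reset_index(drop=True)

# Per-row group mean of z, computed on dff itself (NaN values are skipped).
group_means = dff.groupby("x")["z"].transform("mean")

# Fill only the missing entries; alignment is by dff's own index.
dff["z"] = dff["z"].fillna(group_means)

print(dff["z"].tolist())
# [3.0, 3.0, 3.0, 5.0, 3.0, 1.0, 2.0, 4.0, 6.0, 9.0, 10.0, 5.0]
```

The original `transform(lambda x: x.fillna(x.mean()))` form should give the same result once it is called on `dff`; passing `"mean"` to `transform` just avoids the Python-level lambda.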
