Reputation: 25
How can I fill in missing data with the column average, computed separately for rows where val is 0 and rows where val is 1? What I have tried:
sample = sample.fillna(sample.loc[sample['val'] == 1].mean())
What I want is to fill the NaN values using the mean of the rows where val is 1 and the mean of the rows where val is 0, each handled separately. Something like this (pseudocode):
sample = Fillna(sample.mean() If row is 1) & Fillna(sample.mean() If row is 0 )
Upvotes: 1
Views: 338
Reputation: 323266
Use groupby with apply, grouping on val so that the 0 rows and the 1 rows are each filled with their own mean:
sample=sample.groupby('val').apply(lambda x : x.fillna(x.mean())).reset_index(level=0,drop=True).sort_index()
It is also better to avoid the lambda here: use transform to compute the group means for the whole DataFrame, then pass the resulting DataFrame to fillna:
sample=sample.fillna(sample.groupby('val').transform('mean'))
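To make the transform version concrete, here is a minimal sketch on made-up data. The columns a and b and their values are assumptions for illustration only; just the val column comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 'a' and 'b' are invented columns, 'val' is the group flag.
sample = pd.DataFrame({
    'a': [1.0, np.nan, 3.0, 5.0],
    'b': [np.nan, 10.0, 20.0, 30.0],
    'val': [0, 1, 0, 1],
})

# Per-group means, broadcast back to the original row positions.
# The grouping column 'val' is not part of this result.
group_means = sample.groupby('val').transform('mean')

# fillna aligns on index and column labels, so each NaN is replaced
# by the mean of its own group; 'val' itself is left untouched.
filled = sample.fillna(group_means)
print(filled)
```

Because transform returns a frame with the same index as the input, the row order is preserved and no reset_index or sort_index step is needed.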
Upvotes: 2
Reputation: 9941
We can groupby the val column and then fillna the missing values with the mean values inside each group. Using transform here keeps the row order:
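import numpy as np
import pandas as pd

# Sample data: 'val' is the grouping column; par2 and par3 each contain one NaN
df = pd.DataFrame({'par1': [32, 43, 54, 23],
                   'par2': [24, 43, np.nan, 64],
                   'par3': [84, np.nan, 73, 98],
                   'val': [0, 1, 0, 1]})

# Fill NaNs with the group mean, then re-attach 'val' (transform drops the grouping column)
x = df.groupby('val').transform(lambda g: g.fillna(g.mean())).join(df['val'])
print(x)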
Output:
   par1  par2  par3  val
0    32  24.0  84.0    0
1    43  43.0  98.0    1
2    54  24.0  73.0    0
3    23  64.0  98.0    1
Upvotes: 2