Will
Reputation: 25

How to fill in average of a column given condition in row

How can I fill missing values in each column with that column's average, computed separately for rows where val is 0 and rows where val is 1? What I have tried:

sample = sample.fillna(sample.loc[sample['val'] == 1].mean())

What I want is to fill the NaN values using the mean of the rows where val is 1 and the mean of the rows where val is 0, each handled separately. Something like this:

sample = Fillna(sample.mean() If row is 1) & Fillna(sample.mean() If row is 0 )

Upvotes: 1

Views: 338

Answers (2)

BENY
Reputation: 323266

Use groupby with apply; this works when val only contains 0 and 1.

sample = sample.groupby('val').apply(lambda x: x.fillna(x.mean())).reset_index(level=0, drop=True).sort_index()

It is also better to avoid the lambda here: transform the whole DataFrame to get the per-group means, then pass that DataFrame to fillna.

sample = sample.fillna(sample.groupby('val').transform('mean'))
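
As a minimal sketch of how the second form behaves, assuming a small made-up frame (the columns par1, par2 and their values are illustrative, not the asker's data):

```python
import numpy as np
import pandas as pd

# Illustrative data only; 'val' is the 0/1 grouping column from the question
sample = pd.DataFrame({
    'par1': [1.0, np.nan, 3.0, 5.0],
    'par2': [np.nan, 10.0, 20.0, 30.0],
    'val': [0, 1, 0, 1],
})

# Group means broadcast back to every row: same index as sample,
# with the grouping column 'val' left out of the result
group_means = sample.groupby('val').transform('mean')

# fillna with a DataFrame aligns on index and column labels,
# so each NaN is replaced by the mean of its own group
filled = sample.fillna(group_means)
print(filled)
```

Because group_means has no val column, fillna leaves val untouched, and the row order never changes, so no reset_index or sort_index is needed.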

Upvotes: 2

perl
Reputation: 9941

We can group by the val column and fill the missing values with the mean of each group. Using transform here keeps the original index and row order:

import numpy as np
import pandas as pd

df = pd.DataFrame({'par1': [32,43,54,23],
                   'par2': [24,43,np.nan,64],
                   'par3': [84,np.nan,73,98],
                   'val': [0,1,0,1]})

x = df.groupby('val').transform(lambda x: x.fillna(x.mean())).join(df['val'])

print(x)

Output:

   par1  par2  par3  val
0    32  24.0  84.0    0
1    43  43.0  98.0    1
2    54  24.0  73.0    0
3    23  64.0  98.0    1
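
If only some columns should be filled, the same idea can be wrapped in a small function. This is a sketch under assumptions: fill_with_group_mean is a hypothetical helper name, not part of pandas, and it reuses the example frame from above.

```python
import numpy as np
import pandas as pd

def fill_with_group_mean(df, by, cols):
    """Hypothetical helper: fill NaN in `cols` with the mean of each `by` group."""
    out = df.copy()
    # transform('mean') returns the group means aligned to the original rows
    out[cols] = out[cols].fillna(df.groupby(by)[cols].transform('mean'))
    return out

df = pd.DataFrame({'par1': [32, 43, 54, 23],
                   'par2': [24, 43, np.nan, 64],
                   'par3': [84, np.nan, 73, 98],
                   'val': [0, 1, 0, 1]})

# Fill only par2; the NaN in par3 is deliberately left in place
result = fill_with_group_mean(df, 'val', ['par2'])
print(result)
```

Passing a list of columns keeps the other columns, including val, exactly as they were.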

Upvotes: 2
