Reputation: 105

pandas fill null values by the mean of that category

I have this dataset. You can see there's a null Value for the Product Wheat. I want to fill this null Value with the mean of that product's category. Wheat is in category A, which has a mean of (5+8)/2 = 6.5.

  Product     Value Category
0   Rice        5      A
1   Corn        8      A
2   Milk       17      B
3   Wheat      NaN     A
4   Ice cream   3      B

Here's what I have tried.

df['Value'].fillna(df.groupby('Category')['Value'].mean(), inplace=True)

But this is not working, as the groupby only returns the mean of each category and the null is not filled. So how can I achieve this? Thanks for your help.

Upvotes: 0

Answers (4)

rhug123

Reputation: 8768

Try:

df['Value'] = df.groupby('Category')['Value'].transform(lambda x: x.fillna(x.mean()))
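For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the table in the question, so the construction code is an assumption about how the data is stored; the fill line itself is unchanged.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Product": ["Rice", "Corn", "Milk", "Wheat", "Ice cream"],
    "Value": [5, 8, 17, np.nan, 3],
    "Category": ["A", "A", "B", "A", "B"],
})

# Within each category, replace NaN with that category's mean.
# transform returns a Series aligned with the original rows.
df["Value"] = df.groupby("Category")["Value"].transform(lambda x: x.fillna(x.mean()))

print(df)
```

Wheat should end up with 6.5, the mean of Rice and Corn in category A, and the other rows stay as they were.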

Upvotes: 2

Naga Kalyan

Reputation: 27

data1 = data.groupby('Category')['Value'].mean().reset_index()
data1.columns = ['Category','Mean']
data = data.join(data1.set_index('Category'), on='Category')
data['Value'] = data['Value'].fillna(data['Mean'])
data=data.drop('Mean',axis=1)
data
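The join works because it puts the category mean on every row before filling. As a rough sketch of the same idea without the temporary column, you could map the means onto the Category column. The sample data is rebuilt from the question, and the names `means`, `row_means` and `filled` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Product": ["Rice", "Corn", "Milk", "Wheat", "Ice cream"],
    "Value": [5, 8, 17, np.nan, 3],
    "Category": ["A", "A", "B", "A", "B"],
})

# Per-category means: this Series is indexed by 'A' and 'B', not by row number,
# which is why passing it straight to fillna does not match any row.
means = df.groupby("Category")["Value"].mean()

# Look up each row's category mean so the result is indexed like df (0..4).
row_means = df["Category"].map(means)

# fillna aligns on the index, so now every NaN finds its category mean.
filled = df["Value"].fillna(row_means)
print(filled)
```

This also shows why the attempt in the question has no effect: fillna aligns on index labels, and the groupby result is labelled by category rather than by row.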


Upvotes: 0

Md.Rezaul Karim

Reputation: 46

You can use GroupBy + transform to fill NaN values with groupwise means.

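# Fill each NaN with the mean of its own category.
df['Value'] = df['Value'].fillna(df.groupby('Category')['Value'].transform('mean'))
# Fallback: anything still NaN (a category with no values at all) gets the overall mean.
df['Value'] = df['Value'].fillna(df['Value'].mean())
df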
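The second fillna only matters when a whole category is empty. Here is a small sketch of that case, using the question's column name Value and the question's data plus one made-up row (Oats in a hypothetical category C with no value) that is not in the original dataset.

```python
import numpy as np
import pandas as pd

# Question's data plus one hypothetical row: category C has no known values.
df = pd.DataFrame({
    "Product": ["Rice", "Corn", "Milk", "Wheat", "Ice cream", "Oats"],
    "Value": [5, 8, 17, np.nan, 3, np.nan],
    "Category": ["A", "A", "B", "A", "B", "C"],
})

# Step 1: group-wise mean. Wheat gets 6.5, but Oats stays NaN
# because the mean of an all-NaN group is NaN.
df["Value"] = df["Value"].fillna(df.groupby("Category")["Value"].transform("mean"))
still_missing = df["Value"].isna().sum()

# Step 2: fall back to the overall mean of the (already partly filled) column.
df["Value"] = df["Value"].fillna(df["Value"].mean())

print(df)
```

If every category has at least one value, the second step does nothing and can be dropped.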

Upvotes: 1

Biranchi

Reputation: 16327

Try this:

import numpy as np

for index, row in df.iterrows():
  product = row['Product']
  value = row['Value']
  category = row['Category']
  if np.isnan(value):
    # Mean of this row's category; mean() skips NaN values.
    category_mean = df.groupby('Category')['Value'].mean()[category]
    print(f'category_mean : {category_mean}')
    df.loc[index, 'Value'] = category_mean
  else:
    print(product, value, category)

https://github.com/biranchi2018/My_ML_Examples/blob/master/16.Stackoverflow_Pandas.ipynb

Upvotes: 1
