Reputation: 83
I have a DataFrame in which several values are zero. I want to replace each zero with the mean of its column, without repeating code. The columns runtime, budget, and revenue all contain zeros, and I want to replace those zeros with the mean of the respective column.
I have tried doing it one column at a time, like this:
print(df['budget'].mean())
-> 14624286.0643
df['budget'] = df['budget'].replace(0, 14624286.0643)
Is there a way to write a function so that I don't have to repeat this code for every column that contains zeros?
Upvotes: 8
Views: 38762
Reputation: 89
How about iterating through all columns and replacing them?
for col in df.columns:
    # Assumes every column in df is numeric
    val = df[col].mean()
    df[col] = df[col].replace(0, val)
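Since the question asks for a function, the loop above could be wrapped as in the sketch below. The function name `replace_zeros_with_mean` and the sample data are made up for illustration; only the column names come from the question. Passing an explicit column list keeps non-numeric columns such as the hypothetical `title` out of the loop.

```python
import pandas as pd


def replace_zeros_with_mean(df, columns):
    """Return a copy of df where zeros in `columns` are replaced by that column's mean."""
    out = df.copy()
    for col in columns:
        # The mean is taken over the whole column, zeros included, as in the question
        col_mean = out[col].mean()
        out[col] = out[col].replace(0, col_mean)
    return out


# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "runtime": [100.0, 0.0, 120.0, 80.0],
    "budget": [0.0, 10.0, 20.0, 30.0],
    "revenue": [40.0, 0.0, 0.0, 80.0],
})

result = replace_zeros_with_mean(df, ["runtime", "budget", "revenue"])
print(result)
```

The function works on a copy, so the original `df` is left unchanged; assign the result back to `df` if you want to overwrite it.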
Upvotes: 0
Reputation: 417
The same can be achieved directly with the replace method, without fillna:
df.replace(0,df.mean(axis=0),inplace=True)
Method info: Replace values given in "to_replace" with "value".
Values of the DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc which require you to specify a location to update with some value.
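If the DataFrame also holds non-numeric columns, one possible variant is to limit the replacement to the columns named in the question. The sketch below passes the means as a plain dict, which is the per-column form of `value` described in the `replace` documentation. The sample data and the `title` column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "runtime": [100.0, 0.0, 120.0, 80.0],
    "budget": [0.0, 10.0, 20.0, 30.0],
    "revenue": [40.0, 0.0, 0.0, 80.0],
})

cols = ["runtime", "budget", "revenue"]

# One mean per selected column, as a {column name: mean} dict
means = df[cols].mean().to_dict()

# Replace 0 in each selected column with that column's own mean
df[cols] = df[cols].replace(0, means)
print(df)
```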
Upvotes: 15
Reputation: 323276
Since this is a pandas DataFrame, I would use mask to turn every 0 into np.nan, then fillna with the column means:
df=df.mask(df==0).fillna(df.mean())
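Note that `df.mean()` in the line above is computed on the original frame, so the zeros still count toward each mean. If the zeros actually stand for missing values, you might prefer the mean of the non-zero entries only. The following is a sketch of that variant under that assumption, not what the question literally asks for; the sample data is made up.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "runtime": [100.0, 0.0, 120.0, 80.0],
    "budget": [0.0, 10.0, 20.0, 30.0],
    "revenue": [40.0, 0.0, 0.0, 80.0],
})

cols = ["runtime", "budget", "revenue"]

# Turn zeros into NaN in the selected columns only
masked = df[cols].mask(df[cols] == 0)

# mean() skips NaN by default, so these are the means of the non-zero values
df[cols] = masked.fillna(masked.mean())
print(df)
```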
Upvotes: 15