Reputation: 21260
How can I write the following function in a more idiomatic pandas way?
def calculate_df_columns_mean(self, df):
    means = {}
    # Compute the mean of each column after removing its outliers
    for column in df.columns.tolist():
        cleaned_data = self.remove_outliers(df[column].tolist())
        means[column] = np.mean(cleaned_data)
    return means
Thanks for the help.
Upvotes: 7
Views: 26896
Reputation: 394041
It seems to me that iterating over the columns is unnecessary:
def calculate_df_columns_mean(self, df):
    # Pass the whole DataFrame instead of one column at a time
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()
The above should be enough, assuming that remove_outliers accepts a DataFrame and still returns one.
EDIT
I think the following should work:
def calculate_df_columns_mean(self, df):
    # apply calls the lambda once per column and collects the results in a Series
    return df.apply(lambda x: np.mean(self.remove_outliers(x.tolist())))
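As a rough, self-contained sketch of this apply-based approach: the question does not show remove_outliers, so the version below is a hypothetical stand-in that drops values more than 1.5 * IQR outside the quartiles, and the sample DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd


def remove_outliers(values):
    """Hypothetical stand-in for the asker's method (not shown in the question).

    Keeps only values within 1.5 * IQR of the first and third quartiles.
    """
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    mask = (arr >= q1 - 1.5 * iqr) & (arr <= q3 + 1.5 * iqr)
    return arr[mask].tolist()


def calculate_df_columns_mean(df):
    # apply calls the lambda once per column (axis=0 is the default)
    # and returns a Series indexed by column name
    return df.apply(lambda col: np.mean(remove_outliers(col.tolist())))


# Illustrative data: column "a" contains one obvious outlier (100.0)
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 100.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0],
})
means = calculate_df_columns_mean(df)
```

The result is a Series rather than a dict; calling means.to_dict() gives the same shape as the original function's return value.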
Upvotes: 4
Reputation: 9866
Use DataFrame.apply(func, axis=0):
# axis=0 applies func to each column; axis=1 applies it to each row
df.apply(numpy.sum, axis=0)  # equivalent to df.sum(axis=0)
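To make the axis argument concrete, here is a small sketch on a made-up DataFrame (the column names and values are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: the function receives each column in turn, giving one result per column
col_sums = df.apply(np.sum, axis=0)

# axis=1: the function receives each row in turn, giving one result per row
row_sums = df.apply(np.sum, axis=1)
```

Here col_sums is indexed by column name and row_sums by the original row index. For simple reductions like this, the built-in df.sum(axis=0) is usually the more direct choice; apply is mainly useful for custom functions.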
Upvotes: 5