Reputation: 8844
I have a long list of columns for which I want to calculate the non-zero median, mean and std in one go. I cannot simply delete rows that contain a 0 in one column, because the value in another column of the same row may not be 0.
Below is the code I currently have, which calculates the median, mean, etc. with the zeros included.
agg_list_oper={'ABC1':[max,np.std,np.mean,np.median],
'ABC2':[max,np.std,np.mean,np.median],
'ABC3':[max,np.std,np.mean,np.median],
'ABC4':[max,np.std,np.mean,np.median],
.....
.....
.....
}
df=df_tmp.groupby(['id']).agg(agg_list_oper).reset_index()
I know I could write a long loop that processes one column at a time. Is there a more elegant way to do this with pandas groupby.agg() or some other function?
Upvotes: 0
Views: 4145
Reputation:
You can temporarily replace the 0s with NaN. pandas ignores NaN values when it calculates the median, mean and std, so the zeros no longer count towards the results.
df_tmp.replace(0, np.nan).groupby(['id']).agg(agg_list_oper).reset_index()
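To make this concrete, here is a minimal sketch on made-up data. The sample values and the names value_cols and nonzero are assumptions for illustration; only df_tmp, agg_list_oper, id and the ABC column names come from the question. It departs from the one-liner above in two ways: it replaces zeros only in the value columns (replacing across the whole frame would also turn an id of 0 into NaN, and groupby drops NaN keys by default), and it names the aggregations as strings so that pandas' own NaN-skipping reductions are selected explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names mirror the question.
df_tmp = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 2],
    "ABC1": [0, 2, 4, 0, 0, 6],
    "ABC2": [1, 0, 3, 5, 7, 0],
})

# Build the aggregation dict programmatically instead of typing every column.
value_cols = ["ABC1", "ABC2"]
agg_list_oper = {col: ["max", "std", "mean", "median"] for col in value_cols}

# Turn zeros into NaN in the value columns only; the id column is untouched.
nonzero = df_tmp[value_cols].replace(0, np.nan)

# Group by the original id Series; NaN entries are skipped by each reduction.
df = nonzero.groupby(df_tmp["id"]).agg(agg_list_oper).reset_index()
print(df)
```

A group with a single non-zero value gets NaN for std, because the sample standard deviation (ddof=1) is undefined for one observation.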
Upvotes: 4