Reputation: 763
Using groupby().agg() lets me calculate summary statistics for specifically named columns. However, what if I want to calculate "min", "max" and "mean" for every column of the data frame per group? Is there a way to have pandas add the statistic name to each column name automatically? I do not want to enumerate every column name inside the agg() call.
Upvotes: 0
Views: 1432
Reputation: 3720
You could get there using describe():
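# df is assumed to be an existing numeric DataFrame, with pandas imported as pd
df1 = pd.DataFrame(df.describe().unstack())
# build labels such as 'col1_mean' from the (column, statistic) index pairs
n_label = pd.Series(['_'.join(map(str,i)) for i in df1.index.tolist()])
df1 = df1.reset_index(drop=True)
df1['label'] = n_label
# '_m' matches the '_mean', '_min' and '_max' labels
print(df1[df1['label'].str.contains('_m')].reset_index(drop=True))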
         0      label
0   4.0105  col1_mean
1   0.0000   col1_min
2  12.0000   col1_max
3   3.9639  col2_mean
4   0.0000   col2_min
5  12.0000   col2_max
6   4.0256  col3_mean
7   0.0000   col3_min
8  12.0000   col3_max
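The snippet above summarises the whole frame rather than each group. As a rough sketch of how the same describe() idea might be carried over to groups, the example below uses a made-up frame with a hypothetical grouping column named 'key' (neither is from the question) and keeps only the three requested statistics:

```python
import pandas as pd

# Hypothetical example data; 'key' is an assumed grouping column.
df = pd.DataFrame({'key': ['A', 'A', 'B', 'B'],
                   'col1': [1.0, 2.0, 3.0, 4.0],
                   'col2': [5.0, 6.0, 7.0, 8.0]})

# describe() on a groupby returns two-level column labels: (column, statistic).
stats = df.groupby('key').describe()

# Keep only the statistics asked for in the question.
wanted = stats.columns.get_level_values(1).isin(['min', 'max', 'mean'])
stats = stats.loc[:, wanted]

# Join each (column, statistic) pair into a flat name such as 'col1_mean'.
stats.columns = ['_'.join(pair) for pair in stats.columns]
print(stats)
```

Note that describe() computes every statistic before the unwanted ones are dropped, so this does more work than strictly needed.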
Upvotes: 0
Reputation: 1179
You can iterate over every column and build the new names from the original column name. If you use .agg and do min and max on the same column, you only get the last operation as far as I can tell, though maybe there is a way to do that. So in this example, I do one operation at a time. Here's one way to do what you want, assuming there is a column 'col1' that you will use to line up all the groupby data:
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', 'B', 'B'], 'col2': [1, 2, 3, 4], 'col3': [5, 6, 7, 8]})
col_list = df.columns.tolist()
col_list.remove('col1')  # the column you will use for the groupby output
dfg_all = df[['col1']].drop_duplicates()  # one row per group to merge onto
for col in col_list:
    for op in ['min', 'max', 'mean']:
        if op == 'min':
            dfg = df.groupby('col1', as_index=False)[col].min()
        elif op == 'max':
            dfg = df.groupby('col1', as_index=False)[col].max()
        else:
            dfg = df.groupby('col1', as_index=False)[col].mean()
        # rename e.g. 'col2' to 'col2_min', then attach it to the result
        dfg = dfg.rename(columns={col: col + '_' + op})
        dfg_all = dfg_all.merge(dfg, on='col1', how='left')
to get
  col1  col2_min  col2_max  col2_mean  col3_min  col3_max  col3_mean
0    A         1         2        1.5         5         6        5.5
1    B         3         4        3.5         7         8        7.5
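For comparison, here is a shorter sketch that does not loop. It passes a list of function names to .agg(), which applies each function to every non-grouping column. If only the last operation survived with .agg, a likely cause is a dict with the same column used as a key more than once, since a Python dict keeps only the last value for a repeated key. The sample frame is the same one used above:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', 'B', 'B'],
                   'col2': [1, 2, 3, 4],
                   'col3': [5, 6, 7, 8]})

# A list of function names is applied to every column that is not a group key.
dfg = df.groupby('col1').agg(['min', 'max', 'mean'])

# The result has two-level column labels such as ('col2', 'min');
# join each pair into a flat name like 'col2_min'.
dfg.columns = ['_'.join(pair) for pair in dfg.columns]

# Turn the group key back into an ordinary column.
dfg = dfg.reset_index()
print(dfg)
```

This should give the same columns as the loop, without having to name any column other than the group key.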
Upvotes: 0