Pandas get integer keys after groupby

Question

I have a groupby on multiple columns, and the resulting keys contain all of the grouping columns, which makes the output hard to read. Here's an example:

import pandas as pd
import numpy as np
from pandas import Series

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': np.random.randn(4),
                   'D': ['one', 'two', 'three', 'four']})

def aggregate(x):
    return Series(dict(C=round(x['C'].mean()), D=' '.join(x['D'])))

print(df.groupby(['A', 'B']).apply(aggregate))

       C           D
A B                 
1 1  0.0         one
  2 -1.0         two
2 2 -0.0  three four

How can I get 'normal' keys? Like

   C           D
0  0.0         one
1 -1.0         two
2 -0.0  three four

jezrael · Accepted Answer

For better performance, use DataFrameGroupBy.agg with a dictionary, then call reset_index with drop=True to remove the MultiIndex:

aggregate = {'C':lambda x: round(x.mean()), 'D':' '.join}
print(df.groupby(['A', 'B']).agg(aggregate).reset_index(drop=True))
     C           D
0  0.0         one
1  0.0         two
2  1.0  three four
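
As a minimal sketch of the same approach, the snippet below uses fixed values for C instead of np.random.randn (an assumption made here so the result is reproducible) and the built-in 'mean' aggregation in place of the rounded mean. It shows that reset_index(drop=True) discards the group keys and leaves a plain integer index:

```python
import pandas as pd

# Fixed values for C (assumed here instead of random data) keep the result reproducible.
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': [1.0, 2.0, 3.0, 4.0],
                   'D': ['one', 'two', 'three', 'four']})

# One aggregation per column: mean of C, space-joined strings of D.
spec = {'C': 'mean', 'D': ' '.join}

# drop=True throws away the (A, B) MultiIndex instead of turning it into columns.
result = df.groupby(['A', 'B']).agg(spec).reset_index(drop=True)

print(list(result.index))    # [0, 1, 2]
print(result['D'].tolist())  # ['one', 'two', 'three four']
```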

If you want to convert the MultiIndex levels to columns instead, there are two ways:

print(df.groupby(['A', 'B'], as_index=False).agg(aggregate))

Or:

print(df.groupby(['A', 'B']).agg(aggregate).reset_index())

   A  B    C           D
0  1  1  0.0         one
1  1  2 -1.0         two
2  2  2 -1.0  three four
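
To check that the two variants agree, here is a small sketch with fixed values for C (an assumption, so the comparison is deterministic) and the built-in 'mean' in place of the rounded mean. Both calls should yield the same flat frame with A and B as ordinary columns:

```python
import pandas as pd

# Fixed values for C (assumed here instead of random data) make the comparison deterministic.
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': [1.0, 2.0, 3.0, 4.0],
                   'D': ['one', 'two', 'three', 'four']})

spec = {'C': 'mean', 'D': ' '.join}

# Variant 1: never build the MultiIndex; the group keys stay as columns.
flat_a = df.groupby(['A', 'B'], as_index=False).agg(spec)

# Variant 2: build the MultiIndex, then move its levels back into columns.
flat_b = df.groupby(['A', 'B']).agg(spec).reset_index()

print(list(flat_a.columns))  # ['A', 'B', 'C', 'D']
print(flat_a.to_dict('records') == flat_b.to_dict('records'))
```

as_index=False avoids building the MultiIndex in the first place, while reset_index() is convenient when the grouped result already exists.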
