Brucie Alpha

Reputation: 1216

Pandas get integer keys after groupby

I group by multiple columns, and the resulting index contains all of the grouping columns, which makes the output hard to read. Here's an example:

import pandas as pd
import numpy as np
from pandas import Series

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': np.random.randn(4),
                   'D': ['one', 'two', 'three', 'four']})

def aggregate(x):
    return Series(dict(C=round(x['C'].mean()), D=' '.join(x['D'])))

print(df.groupby(['A', 'B']).apply(aggregate))
       C           D
A B                 
1 1  0.0         one
  2 -1.0         two
2 2 -0.0  three four

How can I get a 'normal' integer index instead? Like this:

   C           D
0  0.0         one
1 -1.0         two
2 -0.0  three four

Upvotes: 1

Views: 627

Answers (2)

jezrael

Reputation: 862761

For better performance, it is better to use DataFrameGroupBy.agg with a dictionary, then call reset_index with drop=True to remove the MultiIndex:

aggregate = {'C': lambda x: round(x.mean()), 'D': ' '.join}
print(df.groupby(['A', 'B']).agg(aggregate).reset_index(drop=True))
     C           D
0  0.0         one
1  0.0         two
2  1.0  three four
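
Since the question builds column C with np.random.randn and no seed, the numbers change on every run. Here is a minimal self-contained sketch of the same approach with fixed values for C (those values are my assumption, chosen only so the result is reproducible):

```python
import pandas as pd

# Same frame as in the question, but with fixed (made-up) values for C
# instead of np.random.randn, so every run gives the same result.
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': [0.2, -1.1, 0.4, 1.8],
                   'D': ['one', 'two', 'three', 'four']})

# One aggregation per column: rounded mean for C, space-joined strings for D.
aggregate = {'C': lambda x: round(x.mean()), 'D': ' '.join}

# drop=True throws the (A, B) MultiIndex away and leaves a plain 0..n-1 index.
result = df.groupby(['A', 'B']).agg(aggregate).reset_index(drop=True)
print(result)
```

With these inputs the three groups have rounded means 0, -1 and 1, and the index is simply 0, 1, 2.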

If you want to convert the MultiIndex levels to columns instead, there are 2 ways:

print(df.groupby(['A', 'B'], as_index=False).agg(aggregate))

Or:

print(df.groupby(['A', 'B']).agg(aggregate).reset_index())
   A  B    C           D
0  1  1  0.0         one
1  1  2 -1.0         two
2  2  2 -1.0  three four
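
To check that the two ways give the same table, here is a small sketch on fixed data (the values for C are my assumption, used in place of np.random.randn so the comparison is reproducible):

```python
import pandas as pd

# Fixed (made-up) values for C so the comparison is reproducible.
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': [0.2, -1.1, 0.4, 1.8],
                   'D': ['one', 'two', 'three', 'four']})

aggregate = {'C': lambda x: round(x.mean()), 'D': ' '.join}

# Way 1: ask groupby not to build an index from the keys at all.
by_flag = df.groupby(['A', 'B'], as_index=False).agg(aggregate)

# Way 2: build the MultiIndex first, then move its levels back into columns.
by_reset = df.groupby(['A', 'B']).agg(aggregate).reset_index()

print(by_flag)
print(by_reset)
```

Both frames should keep A and B as ordinary columns next to C and D, with a default integer index.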

Upvotes: 1

jpp

Reputation: 164693

You can call reset_index with the optional parameter drop=True. Note that this discards the grouping keys (A and B) entirely rather than turning them into columns.

print(df.groupby(['A', 'B']).apply(aggregate).reset_index(drop=True))

   C           D
0  0         one
1 -1         two
2  0  three four
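
For completeness, a self-contained sketch of this apply-based version. Two things here are my own assumptions and not part of the original code: the fixed values for C (to make it reproducible) and the explicit [['C', 'D']] selection before apply, which I added because newer pandas versions change how apply treats the grouping columns.

```python
import pandas as pd

# Fixed (made-up) values for C instead of np.random.randn.
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 2, 2],
                   'C': [0.2, -1.1, 0.4, 1.8],
                   'D': ['one', 'two', 'three', 'four']})

def aggregate(x):
    # x is the sub-frame for one (A, B) group; return one row as a Series.
    return pd.Series({'C': round(x['C'].mean()), 'D': ' '.join(x['D'])})

# Select only the value columns so apply never sees the grouping keys,
# then drop the (A, B) index in favour of a plain integer index.
result = df.groupby(['A', 'B'])[['C', 'D']].apply(aggregate).reset_index(drop=True)
print(result)

# The grouping keys are gone: only C and D remain.
print(list(result.columns))
```

If you still need A and B afterwards, use reset_index() without drop=True instead.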

Upvotes: 1
