Hana

Reputation: 1470

How to get the first group in a groupby of multiple columns?

I've been trying to figure out how I can return just the first group, after I apply groupby.

My code looks like this:

gb = df.groupby(['col1', 'col2', 'col3', 'col4'])['col5'].sum()

What I want is to output only the first group. I've been trying the get_group method, but it keeps failing (maybe because I am grouping by multiple columns?).

Here is an example of my output:

col1  col2  col3   col4  'sum'
 1     34   green   10    0.0
            yellow  30    1.5 
            orange  20    1.1 
 2     89   green   10    3.0 
            yellow   5    0.0 
            orange  10    1.0

What I want to be returned is just this:

col1  col2  col3   col4  'sum'
 1     34   green   10    0.0
            yellow  30    1.5 
            orange  20    1.1 

(Note: I added the 'sum' header here only to make clear what the last column is; pandas does not actually name that column.)

Upvotes: 17

Views: 26748

Answers (4)

jezrael

Reputation: 862641

I believe you need the following, where df is the grouped result with col1 through col4 in its MultiIndex:

idx = df.index.get_level_values(0)
df = df[idx == idx[0]] 

Or DataFrame.xs, which by default drops the selected col1 level from the result (pass drop_level=False to keep it):

df = df.xs(df.index.levels[0][0])

print (df)
                       'sum'
col1 col2 col3   col4       
1    34   green  10      0.0
          yellow 30      1.5
          orange 20      1.1
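
As a minimal sketch of the two approaches above, assuming a small DataFrame that mirrors the question's example (the values in col5 are made up to reproduce the sums shown), applied to the grouped Series rather than a DataFrame:

```python
import pandas as pd

# Hypothetical data shaped like the example in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

gb = df.groupby(["col1", "col2", "col3", "col4"])["col5"].sum()

# Approach 1: boolean mask on the first index level (keeps all four levels).
idx = gb.index.get_level_values(0)
first_masked = gb[idx == idx[0]]

# Approach 2: cross-section on the first label of level 0 (drops col1).
first_xs = gb.xs(gb.index.levels[0][0])

print(first_masked)
print(first_xs)
```

The mask keeps col1 in the index, while xs returns the same three rows indexed only by col2, col3 and col4.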

Upvotes: 4

user2505961

Reputation: 150

for group_id, group_df in df.groupby(['col1', 'col2', 'col3', 'col4']):
    break

Iterate over your groupby object and stop after the first iteration. The variables group_id and group_df will then hold the key and the rows of your first group.

It's kind of an ugly workaround, but it works.
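
A slightly tidier variant of the same idea is next(iter(...)). The sketch below assumes hypothetical data shaped like the question's example. Note that grouping by all four columns makes each group a single combination, so to get the three-row block from the question it groups by col1 and col2 only and aggregates afterwards:

```python
import pandas as pd

# Hypothetical data shaped like the example in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

# Take the first (key, sub-frame) pair without writing a loop.
group_id, group_df = next(iter(df.groupby(["col1", "col2"])))

# Aggregate inside that first group only.
first_sum = group_df.groupby(["col3", "col4"])["col5"].sum()

# For comparison: with all four columns, the first group is a single row.
_, single = next(iter(df.groupby(["col1", "col2", "col3", "col4"])))

print(group_id)
print(first_sum)
```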

Upvotes: 3

BENY

Reputation: 323226

You can use get_group together with groups:

g = df.groupby(['col1', 'col2'])

g.get_group(list(g.groups)[0]).groupby(['col3', 'col4'])['col5'].sum()
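
One likely reason get_group failed in the question is that, with several grouping columns, the key has to be a tuple with one value per column. A minimal sketch of the approach above, assuming hypothetical data shaped like the question's example:

```python
import pandas as pd

# Hypothetical data shaped like the example in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

g = df.groupby(["col1", "col2"])

# g.groups maps each key tuple to the row labels of that group.
first_key = list(g.groups)[0]

# get_group takes the whole tuple as its key.
first_group = g.get_group(first_key)

# Sum col5 within the first group, per (col3, col4) pair.
result = first_group.groupby(["col3", "col4"])["col5"].sum()

print(first_key)
print(result)
```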

Upvotes: 21

piRSquared

Reputation: 294258

gb = df.groupby(['col1', 'col2', 'col3', 'col4'])['col5'].sum()

gb.loc[[gb.index.levels[0][0]]]
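
The double brackets matter here: passing a list to .loc keeps every index level, whereas a bare label drops the first one. A minimal sketch, assuming hypothetical data shaped like the question's example:

```python
import pandas as pd

# Hypothetical data shaped like the example in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

gb = df.groupby(["col1", "col2", "col3", "col4"])["col5"].sum()

# First label of the outermost index level (col1).
first_label = gb.index.levels[0][0]

# List selection: all four index levels are kept.
first_block = gb.loc[[first_label]]

# Scalar selection: the col1 level is dropped.
dropped = gb.loc[first_label]

print(first_block)
```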

Upvotes: 5
