Reputation: 1470
I've been trying to figure out how I can return just the first group, after I apply groupby.
My code looks like this:
gb = df.groupby(['col1', 'col2', 'col3', 'col4'])['col5'].sum()
What I want is to output just that first group. I've been trying the get_group method, but it keeps failing (maybe because I am grouping by multiple columns?).
Here is an example of my output:
col1  col2  col3    col4  'sum'
1     34    green   10    0.0
            yellow  30    1.5
            orange  20    1.1
2     89    green   10    3.0
            yellow  5     0.0
            orange  10    1.0
What I want to be returned is just this:
col1  col2  col3    col4  'sum'
1     34    green   10    0.0
            yellow  30    1.5
            orange  20    1.1
(Note: I added the 'sum' header here only to make it clear what the last column is; pandas does not actually name that column.)
Upvotes: 17
Views: 26748
Reputation: 862641
I believe you need the following (assuming df here holds the grouped result, i.e. gb in your question):
idx = df.index.get_level_values(0)
df = df[idx == idx[0]]
Or use DataFrame.xs:
df = df.xs(df.index.levels[0][0])
print(df)
                          'sum'
col1  col2  col3    col4
1     34    green   10    0.0
            yellow  30    1.5
            orange  20    1.1
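For reference, here is a minimal self-contained sketch of both variants. The sample frame is an assumption reconstructed from the output in the question, and the names first_by_mask and first_by_xs are only illustrative. Note that xs drops the selected level by default, so drop_level=False is passed here to keep col1 in the index.

```python
import pandas as pd

# Sample data assumed from the output shown in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})
gb = df.groupby(["col1", "col2", "col3", "col4"])["col5"].sum()

# Variant 1: boolean mask on the first index level (col1).
idx = gb.index.get_level_values(0)
first_by_mask = gb[idx == idx[0]]

# Variant 2: cross-section on the first level, keeping that level in the result.
first_by_xs = gb.xs(idx[0], level=0, drop_level=False)

print(first_by_mask)
print(first_by_xs)
```

Both selections keep only the rows whose col1 equals the first value of that level.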
Upvotes: 4
Reputation: 150
for group_id, group_df in df.groupby(['col1', 'col2', 'col3', 'col4']):
    break  # stop after the first group
Iterate over your groupby object and stop after the first iteration. The variables group_id and group_df will then hold the key and the rows of your first group.
It's a somewhat ugly workaround, but it works.
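The same idea can be written without a loop by using next(iter(...)). This is only a sketch under assumptions: the sample frame is reconstructed from the question's output, and it groups by just col1 and col2, because with data like this every combination of all four columns is unique, so grouping by all four would return a single row as the "first group".

```python
import pandas as pd

# Sample data assumed from the output shown in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

grouped = df.groupby(["col1", "col2"])

# Take only the first (key, sub-frame) pair that the groupby object yields.
group_id, group_df = next(iter(grouped))

print(group_id)
print(group_df)
```

Here group_id is the key of the first group and group_df contains its rows from the original frame.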
Upvotes: 3
Reputation: 323226
You can use get_group with groups:
g=df.groupby(['col1','col2'])
g.get_group(list(g.groups)[0]).groupby(['col3','col4'])['col5'].sum()
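A runnable version of this approach might look like the sketch below. The sample frame is an assumption reconstructed from the question's output, and first_key and first_sums are illustrative names. With several grouping columns, get_group expects the key as a tuple, which is what the keys of g.groups already are.

```python
import pandas as pd

# Sample data assumed from the output shown in the question.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange", "green", "yellow", "orange"],
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

g = df.groupby(["col1", "col2"])

# The keys of g.groups are (col1, col2) tuples; take the first one.
first_key = list(g.groups)[0]

# Pull out that group's rows, then aggregate within it.
first_sums = g.get_group(first_key).groupby(["col3", "col4"])["col5"].sum()

print(first_key)
print(first_sums)
```

The result is indexed by col3 and col4 only, since col1 and col2 were already fixed by get_group.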
Upvotes: 21
Reputation: 294258
gb = df.groupby(['col1', 'col2', 'col3', 'col4'])['col5'].sum()
gb.loc[[gb.index.levels[0][0]]]
Upvotes: 5