Flag the first row of each group in Pandas

Question

I want to flag the first row of each group.

The input:

import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', 'B', 'B'],
                   'col2': [1, 1, 2, 3],
                   'col3': ['value1', 'value2', 'value3', 'value4']})

I tried:

df.groupby(['col1', 'col2']).first()

But this returns only the first row of each group, not a flag on every row.

I want this output:

col1 col2 col3    first_row
A    1    value1  True
A    1    value2  False
B    2    value3  True
B    3    value4  True

Chris Adams · Accepted Answer

Use groupby.cumcount and eq. If the cumulative count is equal to 0, then it's the first row:

df['first_row'] = df.groupby(['col1', 'col2']).cumcount().eq(0)

[out]

  col1  col2    col3  first_row
0    A     1  value1       True
1    A     1  value2      False
2    B     2  value3       True
3    B     3  value4       True
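
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea using the data from the question. The names `position` and `alt` are only illustrative, and the `duplicated` cross-check is an alternative I added for comparison, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', 'B', 'B'],
                   'col2': [1, 1, 2, 3],
                   'col3': ['value1', 'value2', 'value3', 'value4']})

# cumcount numbers the rows inside each (col1, col2) group starting at 0,
# so comparing with 0 flags the first row of every group.
position = df.groupby(['col1', 'col2']).cumcount()
df['first_row'] = position.eq(0)

# Cross-check (an added alternative): duplicated() marks every repeat of a
# key combination after its first occurrence, so negating it flags the
# first occurrence instead.
alt = ~df.duplicated(subset=['col1', 'col2'])

print(df)
print(alt.equals(df['first_row']))
```

Both approaches keep every row and the original row order, unlike `first()`, which collapses each group to a single row.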
