How to conditionally replace the first row of each group with the second row

Question

I want to filter all qualified boss in the dataset. The rule is, each group has one boss, if the status of the boss is "Active" then its valid, if the statue of boss is "Inactive" then we need to replace it using the second row of that group.

For example, in row 1 group A, Allen is Boss but Inactive, then the second row of group A need to replace the first row. In group B, Kim is Boss and Active, so its valid.

Group	Name	Title	Status
A	Allen	Boss	Inactive
A	Lisa	n	Active
A	Louis	n	Active
B	Kim	Boss	Active
B	Derek	n	Active
B	Andrew	n	Active
B	Katie	n	Active
C	Lux	Boss	Inactive

In the end, all inactive Boss row should be filtered out, end result should be like below.

Group	Name	Title	Status
A	Lisa	n	Active
B	Kim	Boss	Active

Michael S. · Accepted Answer

How's something like this?

df[df.Status.eq("Active")].drop_duplicates(subset=["Group"], ignore_index=True)

Output:

    Group   Name    Title       Status
0   A       Lisa    n           Active
1   B       Kim     Boss        Active

Stepping through it:

df[df.Status.eq("Active")] grabs only the rows where "Status" is "Active"
drop_duplicates(subset=["Group"] drops all rows after the first occurrence of a new value in "Group" ... e.g returns the first row with group A, then the first row with Group B, etc
ignore_index=True) ignores the above rows' index and resets the index to start back at 0. Without this the index would be [1, 3]

How to conditionally replace the first row of each group with the second row

Answers (2)

Related Questions