keynesiancross
keynesiancross

Reputation: 3529

Pandas - Drop rows where *not* totally duplicated

I have a DataFrame that contains the following columns (along with others). I am trying to figure out how to remove all rows where: For each group number in ID_Dets if there exists more than 1 unique number in ID_Dets_2 then drop all rows.

I have bolded the rows that I would want removed. Thx!

Index Other Columns ID_Dets ID_Dets_2
11 aga 4 5
13 af 2 3
14 da 3 4
15 aga 3 4
16 as 0 0
17 ga 0 0
18 asg 0 1
19 asa 0 1
4 ga 1 2
5 aah 1 2

Upvotes: 1

Views: 235

Answers (3)

G.G
G.G

Reputation: 765

def function1(dd:pd.DataFrame):
    return dd.assign(col1=dd.ID_Dets_2.nunique()<=1)

df1.groupby('ID_Dets').apply(function1).loc[lambda dd:dd.col1]

   Index Other Columns  ID_Dets  ID_Dets_2  col1
0     11           aga        4          5  True
1     13            af        2          3  True
2     14            da        3          4  True
3     15           aga        3          4  True
8      4            ga        1          2  True
9      5           aah        1          2  True

Upvotes: 0

scotscotmcc
scotscotmcc

Reputation: 3113

I think the answer by mozway is way slicker than mine, but I was working on it so might as well post it...

You can generate a list of ids to remove and then filter on that list.

ids_to_remove = []
for v in df['ID_Dets'].unique():
    if df[df['ID_Dets'] == v]["ID_Dets_2"].nunique() > 1: ids_to_remove.append(a)
df = df[~df['ID_Dets'].isin(ids_to_remove)]

Upvotes: 1

mozway
mozway

Reputation: 260430

You can count the number of unique values per group and set a threshold to have 1 unique value:

df[df.groupby('ID_Dets')['ID_Dets_2'].transform('nunique').eq(1)]

or:

df.groupby('ID_Dets').filter(lambda g: len(g['ID_Dets_2'].unique())<=1)

output:

   Index Other Columns  ID_Dets  ID_Dets_2
0     11           aga        4          5
1     13            af        2          3
2     14            da        3          4
3     15           aga        3          4
8      4            ga        1          2
9      5           aah        1          2

Upvotes: 4

Related Questions