SalmaFG

Reputation: 2202

Drop all rows that meet condition except the first one

I would like to drop all rows in a pandas DataFrame that meet a certain condition, except the first one. Note that the rows are not identical, so I cannot use drop_duplicates().

For example, if I have the dataframe:

Type    Count
A           4
X          33
X           5
E          51
Y           7

and I want to filter on a condition: df[~df.Type.isin(['X', 'Y'])] would remove all rows where the type is X or Y, resulting in:

Type    Count
A           4
E          51

but I want to keep the first occurrence that satisfies the condition such that the result is:

Type    Count
A           4
X          33
E          51

Any suggestions would be appreciated.

Upvotes: 0

Views: 1221

Answers (2)

Georgina Skibinski

Reputation: 13387

Try:

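import numpy as np

# Give every matching row the same key 'x' and every other row a unique key.
# The row numbers are cast to str so both branches of np.where share a dtype.
df['k'] = np.where(df['Type'].isin(['X', 'Y']), 'x', np.arange(len(df)).astype(str))

# drop_duplicates keeps the first row per key; the helper column is then removed from the result.
res = df.drop_duplicates(subset='k').drop('k', axis=1)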

Outputs:

>>> res

  Type Count
0    A     4
1    X    33
3    E    51
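
Note that the snippet above leaves the helper column k on df itself. As a minimal sketch of the same de-duplication idea without a helper column, one could mark the repeated matches with Series.duplicated on the boolean mask. The function name drop_matches_keep_first and the hard-coded 'Type' column are assumptions for illustration, not part of the answer:

```python
import pandas as pd

df = pd.DataFrame({"Type": ["A", "X", "X", "E", "Y"],
                   "Count": [4, 33, 5, 51, 7]})

def drop_matches_keep_first(frame, types):
    """Hypothetical helper: drop rows whose Type is in `types`, except the first."""
    # True for every row that meets the condition.
    m = frame["Type"].isin(types)
    # duplicated() flags each value of the mask after its first occurrence,
    # so `m & m.duplicated()` is True only for the second and later matches.
    repeat = m & m.duplicated()
    return frame[~repeat]

res = drop_matches_keep_first(df, ["X", "Y"])
print(res)
```

Because nothing is assigned back to df, the original frame keeps its two columns.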

Upvotes: 0

Erfan

Reputation: 42896

We can set the first True value in the mask to False with idxmax, so that the first matching row is kept:

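m = df.Type.isin(['X', 'Y'])   # True for rows that meet the condition
m.loc[m.idxmax()] = False      # idxmax gives the label of the first True; unmark it
df[~m]                         # drop the rows that are still marked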

  Type  Count
0    A      4
1    X     33
3    E     51
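
The idxmax approach looks the first match up by index label. If the index might contain repeated labels, a position-based variant is one option. The sketch below swaps in a running count (cumsum) of the mask instead of idxmax; the helper name keep_first_match is an assumption for illustration:

```python
import pandas as pd

def keep_first_match(frame, mask):
    """Hypothetical helper: drop rows where mask is True, except the first such row."""
    # cumsum() counts the matches seen so far; a matching row whose
    # running count exceeds 1 is a second or later match.
    later = mask & (mask.cumsum() > 1)
    # Index with a plain NumPy array so the selection is purely positional.
    return frame[~later.to_numpy()]

df = pd.DataFrame({"Type": ["A", "X", "X", "E", "Y"],
                   "Count": [4, 33, 5, 51, 7]})
out = keep_first_match(df, df["Type"].isin(["X", "Y"]))
print(out)
```

If no row meets the condition, the frame comes back unchanged, since no row has a running count above 1.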

Upvotes: 1
