Qianru Song

Reputation: 331

Extract the dataframe by multiple conditions in Python

I have a simple dataset, and I need to extract a subset of it by applying multiple conditions in order:

import pandas as pd

df = pd.DataFrame({'animal': ['cat','cat','cat','dog','bird','bird'], 'place': ['A','B','C','A','B','C']})


  1. A cat or dog has to be located in at least two places; if it is not, delete the rows for that animal (that is, where cat or dog appears only once).

The output: the single dog row is removed, leaving the three cat rows and the two bird rows.

  2. A cat or dog has to be in place A; if it is not, delete its rows. For example, if cat only stays in B or C, delete all cat rows, but if cat stays in A as well as B or C (that is, A,B; A,C; or A,B,C), keep all cat rows.

The final output: cat has a row in place A, so all cat rows are kept, and the result is still the three cat rows and the two bird rows.

I am wondering if there is an efficient way to do this. Thank you so much.

Upvotes: 0

Views: 317

Answers (1)

Kenan

Reputation: 14094

Logic for first condition

# "at least two places" means a count of 2 or more, so use >= rather than >
logic1 = df['animal'].value_counts().loc[['cat', 'dog']] >= 2

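Apply it to the df. Animals other than cat and dog are not in logic1 and map to NaN, so they are filled with True and kept.

df = df[df['animal'].map(logic1).fillna(True).astype(bool)]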
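
As an alternative sketch for the first condition, the count can be computed per row with groupby and transform, which avoids looking up each animal in value_counts. This is not the approach above, just one possible variant; the names targets, n_places and step1 are made up for illustration, and it counts distinct places (nunique) on the assumption that repeated animal/place pairs should not count twice.

```python
import pandas as pd

df = pd.DataFrame({'animal': ['cat', 'cat', 'cat', 'dog', 'bird', 'bird'],
                   'place': ['A', 'B', 'C', 'A', 'B', 'C']})

# Hypothetical name: the animals the rule applies to
targets = ['cat', 'dog']

# Number of distinct places per animal, aligned to every row of df
n_places = df.groupby('animal')['place'].transform('nunique')

# Keep rows of non-target animals, plus target animals seen in >= 2 places
step1 = df[~df['animal'].isin(targets) | (n_places >= 2)]
print(step1)
```

With the data from the question, this drops only the dog row.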

This is one approach for logic2

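# True when the animal has no row in place A
logic2cat = df[df['animal'].eq('cat') & df['place'].eq('A')].empty
logic2dog = df[df['animal'].eq('dog') & df['place'].eq('A')].empty

# Check each animal independently; an elif would skip dog whenever cat is dropped
if logic2cat:
    df = df[df['animal'].ne('cat')]
if logic2dog:
    df = df[df['animal'].ne('dog')]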
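
The second condition can also be sketched without one variable per animal, by broadcasting "has a row in A" to every row of the same animal. The data below is a hypothetical variant of the question's dataset (dog appears only in B and C) chosen so that something actually gets removed; targets, has_a and step2 are illustrative names.

```python
import pandas as pd

# Hypothetical data: cat is in A, B, C; dog is only in B and C
df = pd.DataFrame({'animal': ['cat', 'cat', 'cat', 'dog', 'dog', 'bird', 'bird'],
                   'place': ['A', 'B', 'C', 'B', 'C', 'B', 'C']})

# Hypothetical name: the animals the rule applies to
targets = ['cat', 'dog']

# For each row, True if that row's animal has at least one row in place A
has_a = df['place'].eq('A').groupby(df['animal']).transform('any')

# Keep rows of non-target animals, plus target animals that appear in A
step2 = df[~df['animal'].isin(targets) | has_a]
print(step2)
```

Here all dog rows are dropped because dog never appears in A, while bird is kept because the rule only applies to cat and dog.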

Upvotes: 2
