Reputation: 331
I have a simple dataset, and I need to extract a sub-dataset under multiple conditions (applied in order):
import pandas as pd
df = pd.DataFrame({'animal': ['cat', 'cat', 'cat', 'dog', 'bird', 'bird'], 'place': ['A', 'B', 'C', 'A', 'B', 'C']})
The output:
The final output:
Is there an efficient way to do this? Thank you so much.
Upvotes: 0
Views: 317
Reputation: 14094
Logic for the first condition (does 'cat' or 'dog' appear more than twice?):
logic1 = df['animal'].value_counts().loc[['cat', 'dog']] > 2
Apply it to the DataFrame; animals other than 'cat' and 'dog' map to NaN, and fillna(True) keeps them:
df = df[df['animal'].map(logic1).fillna(True)]
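For reference, here is a minimal self-contained sketch of this first step on the sample data. It assumes the first condition is "keep 'cat' and 'dog' only if they appear more than twice", which is inferred from the code above rather than stated in the question. It swaps the map/fillna mask for an isin mask, and uses reindex so that an animal missing from the data does not raise a KeyError. The names `failing` and `step1` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'animal': ['cat', 'cat', 'cat', 'dog', 'bird', 'bird'],
    'place': ['A', 'B', 'C', 'A', 'B', 'C'],
})

# Rows per animal, restricted to the animals the rule applies to.
# reindex (instead of .loc) gives a count of 0 for an absent animal.
counts = df['animal'].value_counts().reindex(['cat', 'dog'], fill_value=0)
logic1 = counts > 2

# Animals that fail the count test (assumed rule: more than 2 rows).
failing = logic1.index[~logic1]

# Keep every row whose animal is not in the failing set;
# animals outside ['cat', 'dog'] are never dropped.
step1 = df[~df['animal'].isin(failing)]
print(step1)
```

On the sample frame this drops the single 'dog' row and keeps all 'cat' and 'bird' rows.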
This is one approach for the second condition:
# True when no row pairs the animal with place 'A'
# (assumes the check runs on df, the frame used above)
logic2cat = df[df['animal'].eq('cat') & df['place'].eq('A')].empty
logic2dog = df[df['animal'].eq('dog') & df['place'].eq('A')].empty
if logic2cat:
    df = df[df['animal'].ne('cat')]
elif logic2dog:
    df = df[df['animal'].ne('dog')]
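A possible variant of the same check, sketched under the assumption that the second condition is "drop an animal that never occurs at place 'A'" (inferred from the code, not stated in the question). The helper `has_row` is hypothetical; it calls `.any()` on the boolean mask instead of building a filtered frame and testing `.empty`. Like the if/elif above, it drops at most one animal.

```python
import pandas as pd

def has_row(frame, animal, place):
    """Return True if frame has at least one row with this animal and place."""
    return bool((frame['animal'].eq(animal) & frame['place'].eq(place)).any())

df = pd.DataFrame({
    'animal': ['cat', 'cat', 'cat', 'dog', 'bird', 'bird'],
    'place': ['A', 'B', 'C', 'A', 'B', 'C'],
})

# Same branching as the if/elif version: only the first failing animal is dropped.
if not has_row(df, 'cat', 'A'):
    result = df[df['animal'].ne('cat')]
elif not has_row(df, 'dog', 'A'):
    result = df[df['animal'].ne('dog')]
else:
    result = df
print(result)
```

On the unfiltered sample frame both 'cat' and 'dog' have a row at place 'A', so nothing is dropped. If both animals should be checked independently, replace the `elif` with a second `if` applied to the intermediate result.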
Upvotes: 2