Java
Java

Reputation: 1292

Pandas select rows that contains both values (inclusive)

I am trying to select only rows that have both values.

For example, I am trying to select only a Patient who has both Right and Left on Ear column.

In this case, it would be only Lisa.

Here is code that I was trying to get the right data.

import pandas as pd

data = {'name': ['Lisa', 'Lisa', 'Mac', 'Intosh'],
    'ear': ['Right','Left','Right','Left']
   }

df = pd.DataFrame(data, columns = ['name', 'ear'])
df.loc[(df['name'] == 'Right') & (df['ear'] == 'Left')]
df

enter image description here

The code retrives everyone, but I am trying to only get row 0 and 1 because Lisa has both Right and Left for ear.

Upvotes: 3

Views: 695

Answers (2)

BENY
BENY

Reputation: 323316

I am using filter

df.groupby('name').filter(lambda x : pd.Series(['Right','Left']).isin(x['ear']).all())
Out[106]: 
   name    ear
0  Lisa  Right
1  Lisa   Left

Or issubset

df.groupby('name').filter(lambda x : {'Right','Left'}.issubset(x['ear'].tolist()))

To fix your code need isin (get intersection of two series)

s1=df.loc[df.ear=='Right','name']
s2=df.loc[df.ear=='Left','name']
df.loc[df.name.isin(s1[s1.isin(s2)]),]
Out[119]: 
   name    ear
0  Lisa  Right
1  Lisa   Left

Upvotes: 3

cs95
cs95

Reputation: 402813

Your solution actually requires the use of groupby and nunique:

df[df.groupby('name')['ear'].transform('nunique') == df['ear'].nunique()]

   name    ear
0  Lisa  Right
1  Lisa   Left

You can simplify this assuming humans can only have two ears ;)

df[df.groupby('name')['ear'].transform('nunique').eq(2)]

   name    ear
0  Lisa  Right
1  Lisa   Left

Details
groupby will compute the number of (unique) ear entries with respect to the name:

df.groupby('name')['ear'].transform('nunique')

0    2
1    2
2    1
3    1
Name: ear, dtype: int64

(The first two rows belong to Lisa.) The result is broadcasted to the original frame.

Then check what rows have a unique count of two and select accordingly.

Upvotes: 4

Related Questions