satoshi
satoshi

Reputation: 439

Pandas not selecting columns based on condition

I have the following dataframe

id   device
814  desktop
814  mobile
236  mobile
498  desktop
348  mobile
287  mobile
287  desktop

I want to query the id's where device is both mobile & desktop

df.loc[(df['device'] == 'mobile') & (df['device'] == 'desktop'), ['id','device']]

The output should be

id  device
814 desktop
814 mobile
287 desktop
287 mobile

Instead I just get an empty dataframe

id device

Upvotes: 2

Views: 47

Answers (2)

rafaelc
rafaelc

Reputation: 59274

You can also use set intersection

cond1 = df[df.device == 'desktop'].id.values
cond2 = df[df.device == 'mobile'].id.values
set(cond1) & set(cond2)

gives

{287, 814}

Then

df.loc[df.id.isin(x)]

    id  device
0   814 desktop
1   814 mobile
5   287 mobile
6   287 desktop

Upvotes: 2

jpp
jpp

Reputation: 164773

The & operator is used to specify multiple conditions, each of which must be True. Therefore, the mask (s == 'mobile') & (s == 'desktop') for a given series s will always return an empty dataframe.

Instead, you can group device to set for each id. Then use set.issubset, or equivalent syntactic sugar <=:

g = df.groupby('id')['device'].apply(set)
search = {'mobile', 'desktop'}

res = df[search <= df['id'].map(g)]

print(res)

    id   device
0  814  desktop
1  814   mobile
5  287   mobile
6  287  desktop

Upvotes: 3

Related Questions