csha

Reputation: 59

Select rows in a dataframe in Python based on two criteria

Based on the dataframe (1) below, I wish to create a dataframe (2) where either y or z is equal to 2. Is there a way to do this conveniently?

And if I wanted to create a dataframe (3) that contains only the rows of dataframe (1) that are not in dataframe (2), how should I approach it?

    id  x  y  z 
    0 324  1  2
    1 213  1  1
    2 529  2  1
    3 347  3  2
    4 109  2  2

...

Upvotes: 1

Views: 65

Answers (3)

seralouk

Reputation: 33127

You can do the following:

import pandas as pd

df = pd.read_csv('data.csv')
df2 = df[(df.y == 2) | (df.z == 2)]

print(df2)

Results:

   id    x  y  z
0   0  324  1  2
2   2  529  2  1
3   3  347  3  2
4   4  109  2  2
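For readers without a `data.csv` at hand, here is a minimal self-contained sketch of the same filter. It assumes the sample values from the question; the variable name `mask` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (stands in for data.csv).
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "x": [324, 213, 529, 347, 109],
    "y": [1, 1, 2, 3, 2],
    "z": [2, 1, 1, 2, 2],
})

# Each comparison yields a boolean Series; | combines them element-wise.
# The parentheses are required because | binds more tightly than ==.
mask = (df["y"] == 2) | (df["z"] == 2)

# Indexing with a boolean Series keeps only the rows where it is True.
df2 = df[mask]
print(df2)
```

Using `df["y"]` rather than `df.y` is a style choice; bracket access also works when a column name clashes with a DataFrame attribute.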

Upvotes: 0

BENY

Reputation: 323226

df[df[['y','z']].eq(2).any(axis=1)]
Out[1205]: 
   id    x  y  z
0   0  324  1  2
2   2  529  2  1
3   3  347  3  2
4   4  109  2  2
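To make the one-liner easier to follow, the sketch below splits it into its intermediate steps. It assumes the sample data from the question; the names `cols`, `hits`, and `row_has_hit` are hypothetical and only used for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "x": [324, 213, 529, 347, 109],
    "y": [1, 1, 2, 3, 2],
    "z": [2, 1, 1, 2, 2],
})

# Columns to test; extending this list is all it takes to cover more columns.
cols = ["y", "z"]

# Step 1: element-wise comparison gives a boolean DataFrame of the same shape.
hits = df[cols].eq(2)

# Step 2: reduce across columns, True where at least one column matched.
row_has_hit = hits.any(axis=1)

# Step 3: keep the matching rows.
df2 = df[row_has_hit]
print(df2)
```

This form tends to scale better than chaining `|` by hand when the number of columns grows.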

Upvotes: 2

cs95

Reputation: 402263

You can create df2 easily enough using a condition (here df1 has id as its index):

df2 = df1[df1.y.eq(2) | df1.z.eq(2)]

df2
      x  y  z
id           
0   324  1  2
2   529  2  1
3   347  3  2
4   109  2  2

Given df2 and df1, you can perform a set difference operation on the index, like this:

df3 = df1.loc[df1.index.difference(df2.index)]

df3 
      x  y  z
id           
1   213  1  1
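As a self-contained sketch of this second step, the example below builds df1 with id as the index (an assumption based on the output shown) and derives df3 in two ways: by negating the boolean condition, and by the index set difference. The names `mask` and `df3_alt` are hypothetical.

```python
import pandas as pd

# Sample data from the question, with id as the index (assumed from the output above).
df1 = pd.DataFrame(
    {
        "x": [324, 213, 529, 347, 109],
        "y": [1, 1, 2, 3, 2],
        "z": [2, 1, 1, 2, 2],
    },
    index=pd.Index([0, 1, 2, 3, 4], name="id"),
)

mask = df1["y"].eq(2) | df1["z"].eq(2)
df2 = df1[mask]

# Option 1: negate the condition with ~ to get the complementary rows.
df3 = df1[~mask]

# Option 2: set difference on index labels, then label-based selection.
# .loc selects by label, which is what Index.difference returns.
df3_alt = df1.loc[df1.index.difference(df2.index)]

print(df3)
```

Negating the mask avoids a second lookup and does not depend on the index being unique, while the index difference is handy when only df2 is available and the original condition is not.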

Upvotes: 1
