Reputation: 59
Based on the dataframe (1) below, I wish to create a dataframe (2) where either y or z is equal to 2. Is there a way to do this conveniently?
And if I were to create a dataframe (3) that only contains rows from dataframe (1) but not dataframe (2), how should I approach it?
id x y z
0 324 1 2
1 213 1 1
2 529 2 1
3 347 3 2
4 109 2 2
...
Upvotes: 1
Views: 65
Reputation: 33127
You can do the following:
import pandas as pd
df = pd.read_csv('data.csv')
df2 = df[(df.y == 2) | (df.z == 2)]
print(df2)
Results:
id x y z
0 0 324 1 2
2 2 529 2 1
3 3 347 3 2
4 4 109 2 2
Upvotes: 0
Reputation: 323226
df[df[['y','z']].eq(2).any(1)]
Out[1205]:
id x y z
0 0 324 1 2
2 2 529 2 1
3 3 347 3 2
4 4 109 2 2
Upvotes: 2
Reputation: 402263
You can create df2
easily enough using a condition:
df2 = df1[df1.y.eq(2) | df1.z.eq(2)]
df2
x y z
id
0 324 1 2
2 529 2 1
3 347 3 2
4 109 2 2
Given df2
and df1
, you can perform a set difference operation on the index, like this:
df3 = df1.iloc[df1.index.difference(df2.index)]
df3
x y z
id
1 213 1 1
Upvotes: 1