Reputation: 619
I am trying some ifthen logic but I am working in a dataframe and couldn't find any examples.
What I am trying to do is filter this dataset to only include values where col1=col3 and col2=col4
col1 col2 col3 col4
Wagner John Wagner John
Jane Mary Klein Peter
Schneider Megan Wicker Sam
Schneider Megan Schneider Megan
result
col1 col2 col3 col4
Wagner John Wagner John
Schneider Megan Schneider Megan
My code here doesn't work
df1.apply(lambda x : x['col1'] if x['col1'] == x['col1'] and x['col2'] == x['col2'] else "", axis=1
Upvotes: 2
Views: 2723
Reputation: 294508
messing around with numpy
and assuming the columns are the specific order they are
df[np.equal(*df.values.T.reshape(2, 2, -1)).all(0)]
col1 col2 col3 col4
0 Wagner John Wagner John
3 Schneider Megan Schneider Megan
If columns were in another order
cols = ['col1', 'col2', 'col3', 'col4']
v = np.column_stack([df[c].values for c in cols])
df[np.equal(*v.T.reshape(2, 2, -1)).all(0)]
Upvotes: 1
Reputation: 210942
I'd use DataFrame.query() method:
In [205]: df.query("col1==col3 and col2==col4")
Out[205]:
col1 col2 col3 col4
0 Wagner John Wagner John
3 Schneider Megan Schneider Megan
or a "classical" approach:
In [206]: df.loc[(df.col1==df.col3) & (df.col2==df.col4)]
Out[206]:
col1 col2 col3 col4
0 Wagner John Wagner John
3 Schneider Megan Schneider Megan
Upvotes: 4