Reputation: 199
I have a pandas data frame in the following format:
col1 col2 col3 col4 col5
A A B Z X
B A Z Z X
A A C Z X
C A C D X
D A B D X
How can I filter the rows where the value in col1 is in col2, col3 or col4 (disregarding col5)?
I tried among other things:
df = df[df[['col2', 'col3', 'col4']].isin(['col1'])]
but get an empty data frame.
The expected output would be:
col1 col2 col3 col4 col5
A A B Z X
A A C Z X
C A C D X
D A B D X
Upvotes: 3
Views: 1889
Reputation: 323386
Fix your code by passing the column itself (df['col1'], not the string 'col1') to isin and then adding any over axis 1:
df = df[df[['col2', 'col3', 'col4']].isin(df['col1']).any(axis=1)]
df
Out[135]:
col1 col2 col3 col4 col5
0 A A B Z X
2 A A C Z X
3 C A C D X
4 D A B D X
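To see why the original attempt finds nothing, here is a minimal sketch, assuming the sample frame from the question (rebuilt by hand below). isin(['col1']) looks for the literal string 'col1' in every cell, whereas passing the Series df['col1'] makes pandas match on the index, so each row is compared with its own col1 value. The names cols, literal_mask, aligned_mask and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'col1': list('ABACD'),
    'col2': list('AAAAA'),
    'col3': list('BZCCB'),
    'col4': list('ZZZDD'),
    'col5': list('XXXXX'),
})

cols = ['col2', 'col3', 'col4']

# Compares every cell with the literal string 'col1': nothing matches
literal_mask = df[cols].isin(['col1'])

# A Series is matched on the index, so row i is compared with df['col1'][i]
aligned_mask = df[cols].isin(df['col1'])

# Reduce to one boolean per row, then filter
result = df[aligned_mask.any(axis=1)]
print(result)
```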
Upvotes: 3
Reputation: 59579
Check if any of the values are equal (eq) to the value in column 1. DataFrame.eq supports an axis argument.
m = df[['col2', 'col3', 'col4']].eq(df['col1'], axis=0).any(axis=1)
df[m]
col1 col2 col3 col4 col5
0 A A B Z X
2 A A C Z X
3 C A C D X
4 D A B D X
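If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: rows_matching_key is a hypothetical helper name, not a pandas API, and the data is the sample frame from the question. axis=0 is what tells eq to line the Series up with the row index rather than with the column labels.

```python
import pandas as pd

def rows_matching_key(frame, key, cols):
    """Hypothetical helper: keep rows whose `key` value appears in any of `cols`."""
    # axis=0 aligns the key Series with the row index (row-by-row comparison)
    mask = frame[cols].eq(frame[key], axis=0).any(axis=1)
    return frame[mask]

# Sample data copied from the question
df = pd.DataFrame({
    'col1': list('ABACD'),
    'col2': list('AAAAA'),
    'col3': list('BZCCB'),
    'col4': list('ZZZDD'),
    'col5': list('XXXXX'),
})

out = rows_matching_key(df, 'col1', ['col2', 'col3', 'col4'])
print(out)
```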
Upvotes: 4
Reputation: 35686
Broadcasting is an option to get equality row-wise, then check if any value is True on axis 1 with any:
import pandas as pd
df = pd.DataFrame({
'col1': {0: 'A', 1: 'B', 2: 'A', 3: 'C', 4: 'D'},
'col2': {0: 'A', 1: 'A', 2: 'A', 3: 'A', 4: 'A'},
'col3': {0: 'B', 1: 'Z', 2: 'C', 3: 'C', 4: 'B'},
'col4': {0: 'Z', 1: 'Z', 2: 'Z', 3: 'D', 4: 'D'},
'col5': {0: 'X', 1: 'X', 2: 'X', 3: 'X', 4: 'X'}
})
m = (df['col1'].values[:, None] == df[['col2', 'col3', 'col4']].values).any(1)
filtered = df[m]
print(filtered)
filtered:
col1 col2 col3 col4 col5
0 A A B Z X
2 A A C Z X
3 C A C D X
4 D A B D X
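In case the [:, None] part is unclear, here is a rough sketch of the shapes involved, using plain NumPy arrays typed in from the question's data (the names key, others, key_col and matches are just for illustration). Adding the extra axis turns col1 into a column vector, which NumPy then stretches across the three comparison columns.

```python
import numpy as np

# col1 values, shape (5,)
key = np.array(['A', 'B', 'A', 'C', 'D'])

# col2..col4 values, shape (5, 3)
others = np.array([
    ['A', 'B', 'Z'],
    ['A', 'Z', 'Z'],
    ['A', 'C', 'Z'],
    ['A', 'C', 'D'],
    ['A', 'B', 'D'],
])

key_col = key[:, None]        # shape (5, 1): one col1 value per row
matches = key_col == others   # broadcast to shape (5, 3)
mask = matches.any(axis=1)    # True where any column equals col1

print(key_col.shape, matches.shape)
print(mask)
```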
Upvotes: 4