Reputation: 1637
I am trying to find a string in all columns of a dataframe.
import pandas as pd
df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'a']], columns=["A", "B"])
for col in df:
df[col].str.contains('a')
0 True
1 False
2 False
Name: A, dtype: bool
0 False
1 False
2 True
Name: B, dtype: bool
However, the code above returns only the booleans and not the format that I want (displays the rows and columns in table form), which can be achieved when searching in a specific column:
df[df.A == 'a']
A B
0 a b
Can anyone help?
Upvotes: 2
Views: 8671
Reputation: 862601
Create boolean DataFrame
and check at least one True
per row by DataFrame.any
and filter by boolean indexing
:
df = df[df.eq('a').any(axis=1)]
print (df)
A B
0 a b
2 e a
Detail:
print (df.eq('a'))
A B
0 True False
1 False False
2 False True
print(df.eq('a').any(axis=1))
0 True
1 False
2 True
dtype: bool
If want check substring
s use str.contains
for boolean DataFrame
:
df = pd.DataFrame([['ad', 'b'], ['c', 'd'], ['e', 'asw']], columns=["A", "B"])
print (df)
A B
0 ad b
1 c d
2 e asw
df = df[df.apply(lambda x: x.str.contains('a')).any(axis=1)]
Or applymap
for elemnt wise checking by in
:
df = df[df.applymap(lambda x: 'a' in x).any(axis=1)]
print (df)
A B
0 ad b
2 e asw
Upvotes: 7
Reputation: 164663
This is one way using the aptly named np.logical_or
.
import pandas as pd, numpy as np
df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'a']], columns=["A", "B"])
mask = np.logical_or.reduce([df[col] == 'a' for col in df])
df[mask]
# A B
# 0 a b
# 2 e a
This method is also adaptable to str.contains
. Fore example:
mask = np.logical_or.reduce([df[col].str.contains('a', na=False) for col in df])
Upvotes: 1