TYL
TYL

Reputation: 1637

Searching for string in all columns of dataframe in Python

I am trying to find a string in all columns of a dataframe.

import pandas as pd
df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'a']], columns=["A", "B"])

for col in df:
        df[col].str.contains('a')

0     True
1    False
2    False
Name: A, dtype: bool
0    False
1    False
2     True
Name: B, dtype: bool

However, the code above returns only the booleans and not the format that I want (displays the rows and columns in table form), which can be achieved when searching in a specific column:

df[df.A == 'a']

   A  B
0  a  b

Can anyone help?

Upvotes: 2

Views: 8671

Answers (2)

jezrael
jezrael

Reputation: 862601

Create boolean DataFrame and check at least one True per row by DataFrame.any and filter by boolean indexing:

df = df[df.eq('a').any(axis=1)]
print (df)
   A  B
0  a  b
2  e  a

Detail:

print (df.eq('a'))
       A      B
0   True  False
1  False  False
2  False   True

print(df.eq('a').any(axis=1))
0     True
1    False
2     True
dtype: bool

If want check substrings use str.contains for boolean DataFrame:

df = pd.DataFrame([['ad', 'b'], ['c', 'd'], ['e', 'asw']], columns=["A", "B"])
print (df)
    A    B
0  ad    b
1   c    d
2   e  asw

df = df[df.apply(lambda x: x.str.contains('a')).any(axis=1)]

Or applymap for elemnt wise checking by in:

df = df[df.applymap(lambda x: 'a' in x).any(axis=1)]

print (df)
    A    B
0  ad    b
2   e  asw

Upvotes: 7

jpp
jpp

Reputation: 164663

This is one way using the aptly named np.logical_or.

import pandas as pd, numpy as np

df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'a']], columns=["A", "B"])

mask = np.logical_or.reduce([df[col] == 'a' for col in df])

df[mask]

#    A  B
# 0  a  b
# 2  e  a

This method is also adaptable to str.contains. Fore example:

mask = np.logical_or.reduce([df[col].str.contains('a', na=False) for col in df])

Upvotes: 1

Related Questions