Reputation: 1698
I am currently following the instructions laid out here for finding values, and it works. The only problem is that my DataFrame is quite big (5×3500 rows) and I need to perform around 2,000 searches. Each one takes around 4 seconds, so obviously this adds up and has become unsustainable on my end.
Most concise way to select rows where any column contains a string in Pandas dataframe?
Is there a faster way to search for all rows containing a string value than this?
df[df.apply(lambda r: r.str.contains('b', case=False).any(), axis=1)]
Upvotes: 6
Views: 2447
Reputation: 164713
One trivial possibility is to disable regex:
res = df[df.apply(lambda r: r.str.contains('b', case=False, regex=False).any(), axis=1)]
Another way using a list comprehension:
res = df[[any('b' in x.lower() for x in row) for row in df.values]]
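For example, on a small hypothetical DataFrame (the three-row data below is an assumption for illustration, not the asker's data), both variants select the same rows:

```python
import pandas as pd

# Hypothetical example DataFrame (not from the original question)
df = pd.DataFrame({'A': ['apple', 'Banana', 'cherry'],
                   'B': ['dog', 'cat', 'Bird']})

# Variant 1: str.contains with regex disabled
res1 = df[df.apply(lambda r: r.str.contains('b', case=False, regex=False).any(),
                   axis=1)]

# Variant 2: plain-Python list comprehension over the underlying array
res2 = df[[any('b' in x.lower() for x in row) for row in df.values]]

print(res1.equals(res2))  # both approaches keep the same rows
```

The list comprehension avoids the per-row `apply` overhead entirely, which is usually where the speedup comes from.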
Upvotes: 2
Reputation: 323306
You can test the speed of np.char.find (requires import numpy as np):
boolfilter = (np.char.find(df.values.ravel().astype(str), 'b') != -1).reshape(df.shape).any(1)
boolfilter
array([False, True, True])
newdf=df[boolfilter]
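A minimal end-to-end sketch of this approach, using a hypothetical three-row DataFrame (the data is an assumption; the asker's df is not shown). Note that np.char.find is case-sensitive, unlike the case=False variants above, so lowercase the array first (e.g. with np.char.lower) if case-insensitive matching is needed:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame standing in for the question's data
df = pd.DataFrame({'A': ['apple', 'book', 'cab'],
                   'B': ['xyz', 'qrs', 'tuv']})

# Flatten all cells into a 1-D string array, search each for the substring
# (np.char.find returns -1 where there is no match), then reshape the hit
# mask back to the DataFrame's shape and keep rows with any hit
boolfilter = (np.char.find(df.values.ravel().astype(str), 'b') != -1) \
                 .reshape(df.shape).any(1)

newdf = df[boolfilter]
print(boolfilter)  # array([False,  True,  True])
```

This stays entirely in vectorized NumPy string routines, which is why it tends to beat a row-wise pandas apply on large frames.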
Upvotes: 4