Reputation: 34368
I have imported a simple query log into a pandas DataFrame in Python (see image) and would like to know the most efficient way to extract all rows whose 'Keyword' column contains a given keyword.
I could iterate over the DataFrame, but I have a feeling there might be a quicker way using arrays/masks.
Any help greatly appreciated.
Upvotes: 1
Views: 4397
Reputation: 375685
You can use str.contains, for example:
In [1]: df = pd.DataFrame([['abc', 1], ['cde', 2], ['efg', 3]])
In [2]: df
Out[2]:
     0  1
0  abc  1
1  cde  2
2  efg  3
In [3]: df[0].str.contains('c')
Out[3]:
0     True
1     True
2    False
Name: 0, dtype: bool
In [4]: df[df[0].str.contains('c')]
Out[4]:
     0  1
0  abc  1
1  cde  2
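To match any one of several keywords in a single pass, one option is to join them into one regular expression and hand that to str.contains. The following is a minimal sketch rather than a definitive recipe: the 'Keyword' column name comes from the question, while the sample rows, the 'Hits' column and the keyword list are made up for illustration.

```python
import re
import pandas as pd

# Hypothetical sample data; only the 'Keyword' column name comes from the question.
df = pd.DataFrame({
    "Keyword": ["cheap flights", "Hotel deals", None, "car hire", "flight times"],
    "Hits": [10, 7, 3, 5, 2],
})

keywords = ["flight", "hotel"]  # illustrative keywords, not from the original post

# Escape each keyword so regex metacharacters are matched literally,
# then join with "|" so that any single keyword counts as a match.
pattern = "|".join(re.escape(k) for k in keywords)

# case=False ignores case; na=False treats missing values as non-matches,
# which keeps the mask strictly boolean so it can be used for indexing.
mask = df["Keyword"].str.contains(pattern, case=False, na=False)
matches = df[mask]
print(matches)
```

Passing na=False matters when the column has missing values, since otherwise the mask would contain missing entries and could not be used directly for boolean indexing.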
Upvotes: 5
Reputation: 129018
In [2]: from pandas import DataFrame, Series
In [3]: df = DataFrame(dict(A=['foo', 'bar', 'bah', 'bad', 'bar'], B=range(5)))
In [4]: df
Out[4]:
     A  B
0  foo  0
1  bar  1
2  bah  2
3  bad  3
4  bar  4
In [5]: select = Series(['bar','bah'])
In [6]: df[df.A.isin(select)]
Out[6]:
     A  B
1  bar  1
2  bah  2
4  bar  4
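Note that isin compares each cell against the listed values as a whole, whereas str.contains looks for a substring inside each cell. Which one fits depends on whether the 'Keyword' column holds a single keyword or a longer query string. A small sketch contrasting the two on the same frame as above; the lookup values "ba" and "bar" are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "bar", "bah", "bad", "bar"], "B": range(5)})

# isin: whole-value equality, so "ba" matches nothing and only "bar" rows are kept.
exact = df[df["A"].isin(["bar", "ba"])]

# str.contains: substring search, so every value containing "ba" is kept.
partial = df[df["A"].str.contains("ba")]

print(exact)
print(partial)
```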
Upvotes: 3