user7289

Reputation: 34368

Extracting rows for a Pandas dataframe in Python

I have imported a simple query log into a pandas DataFrame in Python (see image) and would like to know the most efficient way to extract all rows whose 'Keyword' column contains a given keyword.

I could iterate over the DataFrame, but I have a feeling there is a quicker way using arrays/masks.

Any help greatly appreciated.

[Image: the query log DataFrame]

Upvotes: 1

Views: 4397

Answers (2)

Andy Hayden

Reputation: 375685

You can use str.contains, for example:

In [1]: df = pd.DataFrame([['abc', 1], ['cde', 2], ['efg', 3]])

In [2]: df
Out[2]:
     0  1
0  abc  1
1  cde  2
2  efg  3

In [3]: df[0].str.contains('c')
Out[3]:
0     True
1     True
2    False
Name: 0, dtype: bool

In [4]: df[df[0].str.contains('c')]
Out[4]:
     0  1
0  abc  1
1  cde  2
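
Applied to the question's setup, the same idea might look like the sketch below. Only the 'Keyword' column name comes from the question; the sample rows, the 'Clicks' column, and the search terms are made up for illustration. It assumes you want a case-insensitive substring match against any of several terms, with missing keywords counted as non-matches.

```python
import re

import pandas as pd

# Hypothetical query log; only the 'Keyword' column name comes from the question.
log = pd.DataFrame({
    "Keyword": ["buy shoes", "Cheap Flights", None, "shoe repair", "weather"],
    "Clicks": [10, 4, 0, 7, 2],
})

# Hypothetical search terms; re.escape keeps each term literal inside the regex.
terms = ["shoe", "flight"]
pattern = "|".join(re.escape(t) for t in terms)

# case=False ignores case; na=False treats missing keywords as non-matches.
mask = log["Keyword"].str.contains(pattern, case=False, na=False)
matches = log[mask]
print(matches)
```

Without na=False, a missing value in the column produces a missing entry in the mask, and indexing with such a mask can raise an error, so it is usually worth passing when the log may have empty keywords.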

Upvotes: 5

Jeff

Reputation: 129018

In [2]: from pandas import DataFrame, Series

In [3]: df = DataFrame(dict(A = ['foo','bar','bah','bad','bar'],B = range(5)))

In [4]: df
Out[4]: 
     A  B
0  foo  0
1  bar  1
2  bah  2
3  bad  3
4  bar  4

In [5]: select = Series(['bar','bah'])

In [6]: df[df.A.isin(select)]
Out[6]: 
     A  B
1  bar  1
2  bah  2
4  bar  4
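
Note that isin matches whole cell values, whereas str.contains matches substrings, so which one fits depends on whether the 'Keyword' cells hold a single keyword or longer query text. A minimal sketch of the difference, reusing the frame from this answer (the pattern 'ba' is just an illustrative choice):

```python
import pandas as pd

# Same data as in the answer above.
df = pd.DataFrame({"A": ["foo", "bar", "bah", "bad", "bar"], "B": range(5)})

# isin keeps rows whose value equals one of the listed strings exactly.
exact = df[df["A"].isin(["bar", "bah"])]

# str.contains keeps rows where the pattern occurs anywhere in the string.
partial = df[df["A"].str.contains("ba")]

print(exact.index.tolist())    # [1, 2, 4]
print(partial.index.tolist())  # [1, 2, 3, 4]
```

Here 'bad' is picked up by the substring match but not by isin, since it is not one of the listed values.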

Upvotes: 3
