Reputation: 1446
If I want to filter a column of strings for those that contain a certain term, I can do so like this:
import pandas as pd
df = pd.DataFrame({'col': ['ab', 'ac', 'abc']})
df[df['col'].str.contains('b')]
returns:
   col
0   ab
2  abc
How can I filter a column of lists for those that contain a certain item? For example, from
df = pd.DataFrame({'col':[['a','b'],['a','c'],['a','b','c']]})
how can I get all lists containing 'b'?
         col
0     [a, b]
2  [a, b, c]
Upvotes: 20
Views: 21100
Reputation: 52246
You can use apply, like this.
In [13]: df[df['col'].apply(lambda x: 'b' in x)]
Out[13]:
         col
0     [a, b]
2  [a, b, c]
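One caveat with the lambda above: if the column contains missing values instead of lists, `'b' in x` raises a TypeError. A minimal sketch of a more defensive version follows. The helper name `contains_item` and the extra `None` row are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Same data as in the question, plus one missing cell (added for illustration).
df = pd.DataFrame({'col': [['a', 'b'], ['a', 'c'], ['a', 'b', 'c'], None]})

def contains_item(cell, item):
    """Return True only if `cell` is a list-like container holding `item`."""
    # Non-container cells (None, NaN) are treated as "does not contain".
    return isinstance(cell, (list, tuple, set)) and item in cell

# Build the boolean mask explicitly, then index the DataFrame with it.
mask = df['col'].apply(lambda cell: contains_item(cell, 'b'))
result = df[mask]
```

Here `result` keeps rows 0 and 2, and the row with the missing cell is dropped rather than causing an error.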
Generally, though, storing lists in a DataFrame is a bit awkward; you might find a different representation (a column for each element in the list, a MultiIndex, etc.) easier to work with.
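As a rough sketch of what such alternative representations could look like, here are two options: a long form with one row per list element, and a wide form with one column per list position. The variable names are illustrative, and `Series.explode` assumes pandas 0.25 or later.

```python
import pandas as pd

df = pd.DataFrame({'col': [['a', 'b'], ['a', 'c'], ['a', 'b', 'c']]})

# Long form: one row per list element; the original row label repeats in the index.
long_form = df['col'].explode()

# A row of the original frame matches if any of its elements equals 'b'.
mask_long = long_form.eq('b').groupby(level=0).any()
result_long = df[mask_long]

# Wide form: one column per list position; shorter lists are padded with None.
wide_form = pd.DataFrame(df['col'].tolist(), index=df.index)

# Compare every cell to 'b' and keep rows with at least one match.
mask_wide = wide_form.eq('b').any(axis=1)
result_wide = df[mask_wide]
```

Both masks select rows 0 and 2 here. Which form is preferable probably depends on whether the position of an element within its list matters for later steps.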
Upvotes: 35