Reputation: 31
I have a pandas DataFrame whose entries are lists:
import pandas as pd

data = {'col1': [
['foo', 'bar', 'baz'],
['cat', 'dog', 'horse'],
[1, 2, 3]
]}
df = pd.DataFrame(data)
I then want to use a boolean mask to return the rows whose list contains 'foo' (in this case, row 0). The following returns an empty DataFrame:
df[df['col1'] == 'foo']
The best way I can achieve the above is the following:
df[df['col1'].apply(lambda x: True if 'foo' in x else False)]
but I feel like there is a way to simplify this code. Any suggestions?
Upvotes: 3
Views: 1623
Reputation: 449
As Henry already pointed out in the comments, you can shorten the code by using 'foo' in x directly inside the lambda, since that expression already evaluates to a boolean.
To me, this looks Pythonic enough.
The complete line would be
df[df["col1"].apply(lambda x: 'foo' in x)]
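For reference, here is a minimal self-contained sketch comparing the two masks on the DataFrame from the question. The variable names eq_mask, in_mask and result are only illustrative. My understanding is that == compares each whole list to the string, which is why the first mask selects nothing.

```python
import pandas as pd

df = pd.DataFrame({"col1": [
    ["foo", "bar", "baz"],
    ["cat", "dog", "horse"],
    [1, 2, 3],
]})

# Element-wise equality: each list object is compared to the string 'foo',
# so no row is equal and the mask is False everywhere.
eq_mask = df["col1"] == "foo"

# Membership test: evaluated once per row, True where the list contains 'foo'.
in_mask = df["col1"].apply(lambda x: "foo" in x)

result = df[in_mask]
print(result)
```

Only the membership mask keeps row 0.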
If you want to avoid the lambda expression you can use:
def inside(my_list, key):
    return key in my_list

out = df[df["col1"].apply(inside, key="foo")]
This uses a function defined in advance, which can be reused and extended more easily than an inline lambda expression.
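As a rough sketch of what such an extension could look like, the function below accepts several keys and matches a row if any of them is in the list. The name inside_any and its keys parameter are made up for this illustration. It relies on Series.apply forwarding extra keyword arguments to the function, the same mechanism used with key="foo" above.

```python
import pandas as pd

df = pd.DataFrame({"col1": [
    ["foo", "bar", "baz"],
    ["cat", "dog", "horse"],
    [1, 2, 3],
]})

def inside_any(my_list, keys):
    """Hypothetical helper: True if at least one of `keys` is in `my_list`."""
    return any(k in my_list for k in keys)

# Extra keyword arguments given to Series.apply are passed on to the function.
mask = df["col1"].apply(inside_any, keys=("foo", 3))
out = df[mask]
print(out)
```

Here the first and third rows are kept, because one contains 'foo' and the other contains 3.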
Upvotes: 1