Reputation: 1454
I have a dataframe in which one column contains tuples:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})
   a       b
0  1  (1, 2)
1  2  (3, 4)
2  3  (0, 4)
I would like to select the rows where a value I provide is in the tuple.
For example, to return the rows where 4 is in the tuple, the expected outcome would be:
   a       b
1  2  (3, 4)
2  3  (0, 4)
I have tried:
print(df[df['b'].isin([4])])
But this returns an empty dataframe:
Empty DataFrame
Columns: [a, b]
Index: []
Upvotes: 6
Views: 2483
Reputation: 210832
You can first convert the tuples to sets and then find the set intersections:
In [27]: df[df['b'].map(set) & {4}]
Out[27]:
   a       b
1  2  (3, 4)
2  3  (0, 4)
It'll also work for multiple values, for example if you are looking for all rows where either 1 or 3 is in a tuple:
In [29]: df[df['b'].map(set) & {1, 3}]
Out[29]:
   a       b
0  1  (1, 2)
1  2  (3, 4)
Explanation:
In [30]: df['b'].map(set)
Out[30]:
0    {1, 2}
1    {3, 4}
2    {0, 4}
Name: b, dtype: object
In [31]: df['b'].map(set) & {1, 3}
Out[31]:
0     True
1     True
2    False
Name: b, dtype: bool
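The & expression above depends on how pandas treats a plain set on the right-hand side, and I have not checked that across pandas versions. As a hedged alternative, the same "any of these values" test can be written with an explicit per-row check. This is only a sketch: the name wanted is made up for illustration, and set.isdisjoint is the standard Python set method.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

# Hypothetical name: the values we are looking for.
wanted = {1, 3}

# For each tuple, True when it shares at least one element with `wanted`.
mask = df['b'].map(lambda t: not wanted.isdisjoint(t))

result = df[mask]
print(result)
```

Because the mask is built by an ordinary Python function, it does not rely on any operator overloading between a Series and a set.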
Upvotes: 1
Reputation: 862611
You need apply with in:
print(df[df['b'].apply(lambda x: 4 in x)])
   a       b
1  2  (3, 4)
2  3  (0, 4)
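The reason isin returns nothing in the question is that it compares each whole cell (the tuple itself) with the listed values, and no tuple is equal to 4. The apply approach tests membership inside each tuple instead. A minimal self-contained sketch, where rows_containing is a hypothetical helper name and not part of pandas:

```python
import pandas as pd


def rows_containing(frame, column, value):
    """Return the rows of `frame` whose tuple in `column` contains `value`."""
    # Build a boolean mask by testing membership inside each tuple.
    mask = frame[column].apply(lambda t: value in t)
    return frame[mask]


df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

hits = rows_containing(df, 'b', 4)
print(hits)
```

Wrapping the lambda in a small function like this just makes the column and the searched value explicit parameters; the filtering itself is the same apply call shown above.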
Upvotes: 2