Reputation: 1253
I have two sets:
a = set(['this', 'is', 'an', 'apple!'])
b = set(['apple', 'orange'])
I want to find out whether any element of (b) occurs in (a), including as a substring. Normally I would do:
c = a.intersection(b)
However, in this example it would return an empty set, because 'apple' != 'apple!'.
Assuming I cannot remove characters from (a) and hopefully without creating loops, is there a way for me to find a match?
Edit: I would like it to return the match from (b). For example, I want to know whether 'apple' is in set (a); I do not want it to return 'apple!'.
Upvotes: 6
Views: 2417
Reputation: 180441
Using sets is of little benefit if you are not searching for exact matches. If each word in (b) can only match at the start of a word in (a), sorting and bisecting will be a more efficient approach, i.e. O(n log n) vs O(n^2):
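from bisect import bisect_left
a = set(['this', 'is', 'an', 'apple!'])
b = set(['apple', 'orange'])
srt = sorted(a)
def prefix_hit(word):
    # bisect_left finds the first entry >= word; any entry starting with word sorts right there
    i = bisect_left(srt, word)
    return i < len(srt) and srt[i].startswith(word)  # bounds check avoids an IndexError
inter = [word for word in b if prefix_hit(word)]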
Upvotes: 1
Reputation: 52163
Instead of doing the equality check via ==, you can use the in operator for a substring match, which also covers equality:
>>> [x for ele in a for x in b if x in ele]
['apple']
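One thing to watch with the list comprehension above: it yields one entry per matching pair, so a word from (b) that appears in several elements of (a) is repeated. A minimal sketch of a set-based variant follows; the extra element 'apples' is added here purely for illustration and is not part of the original question.

```python
a = {'this', 'is', 'an', 'apple!', 'apples'}  # 'apples' added for illustration only
b = {'apple', 'orange'}

# List version: one entry per (element of a, element of b) pair that matches.
pairs = [x for ele in a for x in b if x in ele]

# Set version: each element of b is reported at most once.
matches = {x for x in b if any(x in ele for ele in a)}

print(sorted(pairs))
print(matches)
```

The set version returns elements of (b) rather than elements of (a), which is what the question's edit asks for.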
Upvotes: 7
Reputation: 36033
The best thing to do is:
any(x in y for x in b for y in a)
It's a loop, but you can't escape that. Any solution will at least have an implied loop somewhere.
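Note that any() only answers yes or no. If the matching words from (b) are needed, one possible way to keep the explicit loop count down is to push the loop over (b) into a regular expression. This is only a sketch, assuming (b) is non-empty and that non-overlapping matches are sufficient; the names pattern and found are illustrative.

```python
import re

a = {'this', 'is', 'an', 'apple!'}
b = {'apple', 'orange'}

# Build one alternation from b, escaping any regex metacharacters.
# Longer words go first so they win over shorter words at the same position.
pattern = re.compile('|'.join(re.escape(x) for x in sorted(b, key=len, reverse=True)))

# findall returns the matched substrings, i.e. the words from b, not the elements of a.
found = {hit for ele in a for hit in pattern.findall(ele)}

print(found)
```

There is still a loop over (a) here, and the regex engine loops internally, so this changes where the iteration happens rather than removing it.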
Upvotes: 0