Reputation: 51
I have two sets of strings, and I want to compare the elements of set1 with those of set2 and output the number of matching elements. If I can avoid a loop, that would be preferred. The idea is like this:
set1 = ['some','words','are','here']
set2 = ['some','words','are','over','here','too']
The function I'm looking for would output 4 here, counting True for every element of set1 that is contained in set2. An equivalent in R would be
sum(set1 %in% set2)
But I can't find an equivalent in Python. Let me know if any of you guys can help. Cheers
Upvotes: 1
Views: 348
Reputation: 95948
First, you do not have set objects, you have list objects:
>>> set1 = ['some','words','are','here']
>>> set2 = ['some','words','are','over','here','too']
>>> type(set1), type(set2)
(<class 'list'>, <class 'list'>)
>>>
Python supports set literals, which are written with curly braces:
>>> set1 = {'some','words','are','here'}
>>> set2 = {'some','words','are','over','here','too'}
>>> type(set1), type(set2)
(<class 'set'>, <class 'set'>)
Python set objects overload the bitwise operators to perform set operations. You want the number of elements in the set intersection, so use the bitwise AND operator (&):
>>> set1 & set2
{'are', 'here', 'words', 'some'}
>>> len(set1 & set2)
4
Alternatively, you can use a more object-oriented style:
>>> set1.intersection(set2)
{'are', 'here', 'words', 'some'}
>>> len(set1.intersection(set2))
4
I prefer the operators, personally:
>>> set1 & set2 # intersection
{'are', 'here', 'words', 'some'}
>>> set1 | set2 # union
{'some', 'here', 'words', 'too', 'over', 'are'}
>>> set1 - set2 # difference
set()
>>> set2 - set1 # difference
{'too', 'over'}
>>> set2 ^ set1 # symmetric difference
{'over', 'too'}
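One practical difference between the two styles, shown as a small sketch: the operators need a set on both sides, while the named methods accept any iterable. The variable names below are made up for illustration only.

```python
words = {'some', 'words', 'are', 'here'}
other = ['some', 'words', 'are', 'over', 'here', 'too']  # a list, not a set

# The method form accepts any iterable as its argument.
common = words.intersection(other)
count_via_method = len(common)

# The operator form raises TypeError when the right operand is a list.
try:
    words & other
    operator_worked = True
except TypeError:
    operator_worked = False

# Converting the list to a set first makes the operator form work.
count_via_operator = len(words & set(other))
```

So if one side is still a list, either call the method or wrap the list in set() before using the operator.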
If you have list objects, just convert them to a set:
>>> l1 = ['some','words','are','here']
>>> l2 = ['some','words','are','over','here','too']
>>> set(l1).intersection(l2)
{'some', 'are', 'words', 'here'}
>>> len(set(l1).intersection(l2))
4
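Keep in mind that a set intersection counts each distinct string once. If the first list can contain duplicates and you want every occurrence counted, which is how R's sum(set1 %in% set2) behaves, a generator expression is one way to sketch it. The helper name count_matches is hypothetical, not a built-in.

```python
def count_matches(items, candidates):
    """Count how many elements of items appear in candidates, duplicates included."""
    lookup = set(candidates)  # set membership tests are O(1) on average
    # Each membership test yields True or False; sum() treats True as 1.
    return sum(item in lookup for item in items)

l1 = ['some', 'words', 'are', 'here']
l2 = ['some', 'words', 'are', 'over', 'here', 'too']
print(count_matches(l1, l2))  # 4

# With a repeated element the two approaches give different counts.
l3 = ['some', 'some', 'words']
print(count_matches(l3, l2))           # 3
print(len(set(l3).intersection(l2)))   # 2
```

For lists without duplicates, both approaches give the same number, so the intersection is usually the simpler choice.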
Upvotes: 2