Reputation: 743
I have a set s and a list of sets l, as below.
s = {1,2,3,4}
l = [{1}, {1,2,3}, {3}]
The output should be
out = [{1}, {1,2,3}, {3}]
I am using the following code to accomplish this, but I was hoping there would be a faster way. Perhaps some sort of broadcasting?
out = [i.intersection(s) for i in l]
EDIT
List l can be up to 1000 elements long.
My end objective is to create a matrix holding the sizes of the pairwise intersections of the elements of l, so s is itself an element of l.
out_matrix = list()
for s in l:
    out_matrix.append([len(i.intersection(s)) for i in l])
Upvotes: 0
Views: 178
Reputation: 81684
My first thought when reading this question was "sure, use numpy". Then I decided to do some tests:
import numpy as np
from timeit import Timer
s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}] * 1000 # 3000 elements
arr = np.array(l)
def list_comp():
    [i.intersection(s) for i in l]
def numpy_arr():
    arr & s
print(min(Timer(list_comp).repeat(500, 500)))
print(min(Timer(numpy_arr).repeat(500, 500)))
This outputs
# 0.05513364499999995
# 0.035647999999999236
So numpy is indeed a bit faster. Is it really worth it? I'm not sure. A ~0.02 second difference for a 3000-element list is negligible (especially considering that my test didn't even account for the time it took to create arr).
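To see how much the array construction matters, the conversion can be moved inside the timed function. This is only a minimal sketch: numpy_with_setup is a name introduced here for illustration, the repeat counts are reduced arbitrarily, and the numbers will vary by machine.

```python
import numpy as np
from timeit import Timer

s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}] * 1000  # 3000 elements


def list_comp():
    # Plain Python loop over the list of sets.
    return [i.intersection(s) for i in l]


def numpy_with_setup():
    # Hypothetical variant: build the object array inside the timed
    # function so that the conversion cost is part of the measurement.
    return np.array(l, dtype=object) & s


# Smaller repeat counts than in the test above, chosen only to keep this quick.
t_list = min(Timer(list_comp).repeat(5, 50))
t_numpy = min(Timer(numpy_with_setup).repeat(5, 50))
print(t_list)
print(t_numpy)
```

Both functions return the same intersections, so any timing gap comes from how the loop and the conversion are executed, not from doing different work.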
Keep in mind that even when using numpy we are still in O(n) territory. The difference is due to the fact that numpy pushes the for loop down to the C level, which is inherently faster than a Python for loop.
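For the pairwise size matrix from the question's edit, a different route is to encode each set as a 0/1 row and let a matrix product do the counting. This is a sketch of a swapped-in technique that I did not time above; the names membership and col are introduced here for illustration, and it assumes the elements are hashable and that a dense array of shape (number of sets, number of distinct elements) fits in memory.

```python
import numpy as np

l = [{1}, {1, 2, 3}, {3}]

# Assign every distinct element a column index (the order is arbitrary).
col = {value: j for j, value in enumerate(set().union(*l))}

# membership[i, j] is 1 when l[i] contains the element mapped to column j.
membership = np.zeros((len(l), len(col)), dtype=np.int64)
for i, members in enumerate(l):
    membership[i, [col[v] for v in members]] = 1

# Entry (i, j) of the product counts the elements shared by l[i] and l[j].
out_matrix = membership @ membership.T
print(out_matrix)
```

Building membership is still a Python loop over the sets, but the quadratic part (every pair of sets) then runs inside a single matrix multiplication.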
Upvotes: 1