deepAgrawal

Reputation: 743

Intersect a set with a list of sets in python

I have a set s and a list of sets l, as below.

s = {1,2,3,4}
l = [{1}, {1,2,3}, {3}]

The output should be

out = [{1}, {1,2,3}, {3}]

I am using the following code to do this, but I was hoping there would be a faster way, perhaps some sort of broadcasting?

out = [i.intersection(s) for i in l]

EDIT

The list l can be up to 1000 elements long.

My end objective is to create a matrix holding the sizes of the pairwise intersections of the elements of l, so s is itself an element of l.

out_matrix = list()
for s in l:
    out_matrix.append([len(i.intersection(s)) for i in l])

Upvotes: 0

Views: 178

Answers (1)

DeepSpace

Reputation: 81684

My first thought when reading this question was "sure, use numpy". Then I decided to do some tests:

import numpy as np
from timeit import Timer

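s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}] * 1000  # 3000 elements
arr = np.array(l)  # object array whose elements are the Python sets


def list_comp():
    # Plain Python: intersect every set in l with s
    [i.intersection(s) for i in l]


def numpy_arr():
    # numpy applies & element by element across the object array
    arr & s

# Each timing is the best of 500 repeats, 500 calls per repeat
print(min(Timer(list_comp).repeat(500, 500)))
print(min(Timer(numpy_arr).repeat(500, 500)))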

This outputs

# 0.05513364499999995
# 0.035647999999999236

So numpy is indeed a bit faster. Is it really worth it? I'm not sure. A difference of about 0.02 seconds over 500 calls on a 3000-element list is negligible (especially considering that my test didn't even take into account the time it took to create arr).
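
As a sanity check, the two functions being timed do produce the same sets. A minimal sketch, assuming the sets are stored in a dtype=object array (passed explicitly here, which the test above does not do), so that & is applied to each element in turn:

```python
import numpy as np

s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}]

# Passing dtype=object explicitly is an assumption made here for clarity;
# each array element is then an ordinary Python set.
arr = np.array(l, dtype=object)

from_list_comp = [i.intersection(s) for i in l]
from_numpy = list(arr & s)  # evaluates `element & s` for every set in arr

same = from_list_comp == from_numpy
```

Since every element is still a Python set, each individual intersection is the same set operation in both versions; only the loop around it differs.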

Keep in mind that even when using numpy we are still at O(n). The difference is due to the fact that numpy pushes the for loop down to the C level, which is inherently faster than a Python for loop.
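
For the pairwise matrix from the question's edit, a different approach may be worth timing: encode each set as a row of a 0/1 membership matrix and let a single matrix product count the shared elements. This is only a sketch, not something I benchmarked. The function name pairwise_intersection_sizes and the helper names inside it are hypothetical, and it assumes the union of all sets is small enough for a dense matrix to fit in memory:

```python
import numpy as np


def pairwise_intersection_sizes(sets):
    """Return an n x n array whose [i, j] entry is len(sets[i] & sets[j])."""
    # Give every distinct element a column index.
    universe = list(set().union(*sets))
    col = {x: k for k, x in enumerate(universe)}

    # Membership matrix: m[i, k] == 1 if universe[k] is in sets[i].
    m = np.zeros((len(sets), len(universe)), dtype=np.int64)
    for i, st in enumerate(sets):
        idx = np.fromiter((col[x] for x in st), dtype=np.intp, count=len(st))
        m[i, idx] = 1

    # The dot product of two 0/1 rows counts the elements they share.
    return m @ m.T


l = [{1}, {1, 2, 3}, {3}]
result = pairwise_intersection_sizes(l)

# Same matrix built with the nested loop from the question, for comparison.
expected = [[len(i.intersection(s)) for i in l] for s in l]
```

The diagonal holds the size of each set. Whether this beats the nested list comprehension depends on how many distinct elements there are, so it is worth measuring on the real data.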

Upvotes: 1
