Christopher Thompson
Christopher Thompson

Reputation: 3733

Set intersection that considers values within a certain range as intersecting

I'm trying to compute the intersection of 2 sets in Python and I'd like to consider values that are within a certain range (+/- 5) as being equal. So for example:

set1 = [22, 570, 233, 127, 92]

set2 = [897, 27, 673, 231, 45]

What I want is for:

len[x for x in set1 if x in set2] == 2

where the intersections are 22 & 27 as well as 233 & 231.

Is there an efficient way of doing this?

Upvotes: 1

Views: 154

Answers (2)

ShadowRanger
ShadowRanger

Reputation: 155418

At the expense of memory, for smallish ranges (your example said +-5), you could just make larger sets for the check.

from itertools import chain

set1 = {22, 570, 233, 127, 92}
set2 = {897, 27, 673, 231, 45}

# Make a frozenset of all values considered overlapping with set2
overlap = 5
overlaprange = range(-overlap, overlap+1)
match2 = frozenset(chain.from_iterable((x + i for i in overlaprange) for x in set2))

print(len(set1 & match2))

This will never count a single element more than once, and if the values in set2 are close together, doesn't store their overlap more than once (conversion to frozenset for match2 uniquifies). If one of the sets is being matched against several other sets, you'd convert the reused set to a "match set", then use that over and over for each intersection check.

Upvotes: 0

Kevin
Kevin

Reputation: 76194

If by "efficient" you mean "short", sure.

>>> set1 = [22, 570, 233, 127, 92]
>>> set2 = [897, 27, 673, 231, 45]
>>> len([x for x in set1 if any(abs(x-y) <= 5 for y in set2)])
2

If by "efficient" you mean "with a reasonable big-O run time", that's not so easy. The above approach is O(N^2). You could get O(N * log(N)) if you sort both sets and iterate over them in parallel looking for approximate matches, but it's not entirely trivial to do.

[edit - removed O(N log N) code that worked for the OP's input but not for set1=[1,3,4] and set2=[2,10]]

Upvotes: 1

Related Questions