GeorgeMenezes
GeorgeMenezes

Reputation: 93

Find total number of intersections between a given set of ranges in python

What is the best way to count number of intersections of a given set of ranges.

For ex: consider a list of range pairs[start,stop]

[[1,5], [3,7], [9,11], [6,8]]

Here there are total 2 intersections ,

[1,5] intersects with [3,7]

and [3,7] intersects with [6,8]

Upvotes: 2

Views: 1654

Answers (6)

LevB
LevB

Reputation: 953

The "portion" module might be helpful when you work with segments.

import portion as P
from itertools import combinations
li = [[1,5], [3,7], [9,11], [6,8]]

pairs=[el for el in combinations(li,2) if P.open(*el[0]) & P.open(*el[1]) != P.empty()]

print(pairs)

The output is a list of pairs that intersect:

[([1, 5], [3, 7]), ([3, 7], [6, 8])]

Upvotes: 0

Carlos
Carlos

Reputation: 6021

This problem can be done in nlogn time, of course you can do it in n^2 but it sounds like you want it time optimal.

I'd call these interval overlaps, and it' a classic you'll find variants of it in many interview books. Here's how to do it:

  1. Sort the items by the starts.
  2. Now step through the items. Store a minheap for each item of where it ends, costs nlogn but doesn't matter, we already paid that. At each new start number you know how many of the previous intersections are overlapping. Remove the intervals that you've run past the end of.

Ends up being nlogn due to the search, doesn't matter that you're then stepping through in n time.

Example:

  1. Sort to [1, 5], [3, 7], [6, 8], [9, 11]
  2. Store a heap item that says it ends at 5.
  3. Get to the second item. Heap is size 1, so add 1 to the overlap count. Add a 7 to the heap.
  4. Get to the third item. Drop the 5 from the heap. Leave the 7, so add 1 again. Add the 8 to the heap.
  5. Get to the 4th item, Drop the 7 and the 8, leaving an empty heap. Result is 2.
import heapq
import operator

def mysol(v):

    overlaps = 0
    minheap = []

    vsorted = sorted(v, key=operator.itemgetter(0))

    for i in range(len(vsorted)):
        while len(minheap) > 0 and minheap[0] < vsorted[i][0]:
            heapq.heappop(minheap)

        overlaps += len(minheap)
        heapq.heappush(minheap, vsorted[i][1])

    return overlaps

Upvotes: 4

Tadhg McDonald-Jensen
Tadhg McDonald-Jensen

Reputation: 21453

The lazy man's way would be to just generate ranges and check for set intersection, this is terrible for memory usage and not extendable to non integers, but it is short:

import itertools
def range_intersection(a,b):
    return len(set(range(*a)) & set(range(*b))) > 0

data = [[1,5], [3,7], [9,11], [6,8]]

for a,b in itertools.combinations(data, 2):
    if range_intersection(a,b):
        print(a,b)

Here set(range(*a)) uses the start and end arguments to range then makes a set so it can be intersected with another range, then we just check if the length of intersection is greater than 0 (there are common elements). itertools.combinations is to simplify checking all combinations of data together.

Upvotes: 0

Chris Charley
Chris Charley

Reputation: 6573

Using the intspan module, a solution could be:

>>> from itertools import combinations
>>> from intspan import intspan

>>> L = [[1,5], [3,7], [9,11], [6,8]]
>>> for s1, s2 in combinations(L, 2):
    if intspan.from_range(*s1) & intspan.from_range(*s2):
        print(s1, 'intersects', s2)

Prints:

[1, 5] intersects [3, 7]
[3, 7] intersects [6, 8]

Upvotes: 1

DarrylG
DarrylG

Reputation: 17156

Modification of algorithm to find Maximum Number of Overlaps to compute number of overlaps instead.

Approach

  • The idea is to store coordinates in a new vector of pair mapped with characters ‘x’ and ‘y’, to identify coordinates.
  • Sort the vector.
  • Traverse the vector, if an x coordinate is encountered it means a new range is added, so increment overlap count
  •  If count > 1, we have a new overlap
         so increment number of overlaps
    
  • if y coordinate is encountered that means a range ends, so decrement overlap count
  • The result is the number of overlaps

The algorithm complexity is O(n*log(n)) (from sort)

Code

def overlap(v): 
    # variable to store the maximum 
    # count 
    ans = 0
    count = 0
    data = [] 
  
    # storing the x and y 
    # coordinates in data vector 
    for i in range(len(v)): 
  
        # pushing the x coordinate 
        data.append([v[i][0], 'x']) 
  
        # pushing the y coordinate 
        data.append([v[i][1], 'y']) 
  
    # sorting of ranges 
    data = sorted(data) 
  
    # Traverse the data vector to 
    # count number of overlaps 
    for i in range(len(data)): 
  
        # if x occur it means a new range 
        # is added so we increase count 
        if (data[i][1] == 'x'): 
            count += 1
  
            if count > 1:
              ans += (count - 1) # new range intersets
                                 # count - 1 existing
                                 # ranges 


        # if y occur it means a range 
        # is ended so we decrease count 
        if (data[i][1] == 'y'): 
            count -= 1
  
  
    # Return number of overlaps 
    return ans
  

Tests

v = [ [ 1, 2 ], [ 2, 4 ], [ 3, 6 ] ] 
print(overlap(v)) # Output 2

v = [[1,5], [3,7], [9,11], [6,8]]
print(overlap(v)) # Output 2

v = [[1,5], [3,7], [9,11], [6,8], [1, 11]]
print(overlap(v))  # Output 6

v = [ [ 1, 3 ], [ 2, 7 ], [3, 5], [4, 6] ] 
print(overlap(v))  # Output 5

Upvotes: 2

Akshay Sehgal
Akshay Sehgal

Reputation: 19322

Try this one-liner simple method which uses a list comprehension and itertools.combinations -

  1. Iterate over combinations of lists
  2. Turn them into ranges and then sets of those ranges
  3. Take an intersection
  4. Only return those with intersection = True
import itertools
r = [[1,5], [3,7], [9,11], [6,8]]

overlapping_ranges = [i for i in list(itertools.combinations(r, 2))\
                      if set(range(*i[0])).intersection(set(range(*i[1])))]

print('Count of overlapping ranges:',len(overlapping_ranges))
print(overlapping_ranges)
Count of overlapping ranges: 2
[([1, 5], [3, 7]), ([3, 7], [6, 8])]

Upvotes: 2

Related Questions