Thanh Nguyen
Thanh Nguyen

Reputation: 912

Intersection and Complement of multiple lists of ranges

I have a problem like this: I have multiple list of intervals,now I want to get all the possible intersections and complements of it. For example, in case of 2 lists:

a = [[0, 2], [5, 10], [13, 23], [24, 25]]
b = [[1, 5], [8, 12], [15, 18], [20, 24]]

The output should be
output_a = [[0, 1], [1, 2], [5, 8], [8, 10], [13, 15], [15, 18], [18, 20], [20, 23], [24, 25]]
output_b = [[1, 2], [2, 5], [8, 10], [10, 12], [15, 18], [20, 23], [23, 24]]

and so on for n list

For visualization:

    List_a           |-----------|     |----------|
    List_b                  |------------|   |-------|
    List_c           |---------|     |------------|

       Set           |------|--|-|---|-|-|---|----|--|

  Output_a           |------|--|-|     |-|---|----|
  Output_b                  |--|-|---|-|-|   |----|--|
  Output_c           |------|--|     |-|-|---|----|

My pseudocode with Python set()

Get all (start, end) of all intervals in all list into a set()
for each list:
    for intervals in list:
        for each element in set:
            if element in intervals:
                ->store element in dict

But it seems like this is not efficient and elegant, since it requires 3 for loops

Could you please suggest a better algorithm for this case please? Thank you very much

Upvotes: 2

Views: 927

Answers (2)

Guybrush
Guybrush

Reputation: 2780

(Note: I did not fully understand the question and how you compose output_a and output_b. In this "solution", I'll provide the building bricks using a library I developed, I hope it helps, but remember, the output of this "solution" does not match exactly what you provided!)

I'm the maintainer of portion, a Python library providing data structure and operations for intervals (and interval sets). The library notably supports automatic simplification of intervals, which leads to a different output than the one proposed.

Let's import the library, and define your lists:

>>> import portion as P
>>> a = [[0, 2], [5, 10], [13, 23], [24, 25]]
>>> b = [[1, 5], [8, 12], [15, 18], [20, 24]]

Now we convert these lists to (disjunctions of) intervals:

>>> a = P.Interval(*[P.closed(x, y) for x, y in a])
>>> b = P.Interval(*[P.closed(x, y) for x, y in b])
>>> a
[0,2] | [5,10] | [13,23] | [24,25]
>>> b
[1,5] | [8,12] | [15,18] | [20,24]

Let's compute the intersection and its complement:

>>> intersection = a & b
>>> complement = ~intersection
>>> intersection
[1,2] | [5] | [8,10] | [15,18] | [20,23] | [24]
>>> complement
(-inf,1) | (2,5) | (5,8) | (10,15) | (18,20) | (23,24) | (24,+inf)

Since you do not care about infinities, let's restrict the complement to the range of a and b. To do so, we take the enclosure of a | b (the enclosure being the smallest atomic interval containing the given interval):

>>> (a | b).enclosure
[0,25]
>>> complement = complement & (a | b).enclosure
>>> complement
[0,1) | (2,5) | (5,8) | (10,15) | (18,20) | (23,24) | (24,25]

Based on intersection and complement, you have all possible "breakpoints" (and their "origin").

Upvotes: 3

Daniel Junglas
Daniel Junglas

Reputation: 5930

It seems that your example output does not match your picture: According to your picture, the element [24,25] should not be in output_b since b stops at 24. Also, according to the picture, [5,8] should not be in output_b since b does not intersect this interval.

Here is a code that generates the lists according to your picture. The code also has three nested loop but for each list, it goes through the list and the set only once, so the two innermost loops only have linear complexity, not quadratic. The idea is to sort the union of breakpoints and then scan the two lists simultaneously. It assumes that the intervals in the original list do not overlap and are in increasing order.

a = [[0, 2], [5, 10], [13, 23], [24, 25]]
b = [[1, 5], [8, 12], [15, 18], [20, 24]]

lists = [a, b]

# collect all endpoints first into a set and the into a sorted sequence
breaks = set()
for l in lists:
    for i in l:
        breaks.update(i)
breaks = sorted(breaks)

for l in lists:
    output = []
    b = 0
    # For each interval
    for start, end in l:
        # Advance b until it falls into this interval:
        while breaks[b] <= start: b += 1
        # Now collect all sub-intervals from
        while b < len(breaks) and breaks[b] <= end:
            output.append([start, breaks[b]])
            start = breaks[b]
            b += 1
    print(output)

The output is

[[0, 1], [1, 2], [5, 8], [8, 10], [13, 15], [15, 18], [18, 20], [20, 23], [24, 25]]
[[1, 2], [2, 5], [8, 10], [10, 12], [15, 18], [20, 23], [23, 24]]

Upvotes: 2

Related Questions