Neptune
Neptune

Reputation: 627

Java algorithm for find intersection between intervals

I have a time interval like this:

[5,10]

and I have more list of time point, with different length, for example:

t1=[3,6,9,10] 
t2=[2,4,5,6,10]
..

where in t1 [3,6] is the first interval, [6,9] the second and so on.

Same thing for t2 and other list.

Now I need to save the list, and the specific interval, that intersect with first time interval. For example in t1 I have [3,6] that intersect on [5,10], [6,9] that intersect with [5,10], etc.

I have made an algorithm but I work with more data and I need a fast algorithm. For example if I work with 300.000 list and every list have 200 time points my algorithm 1 fine in about 5-10sec. But if I have 10.000 or more time point the algorithm is very slow.

My algorithm is like this:

First time interval <- [t1,t2]
For each list
  For each time interval [s1,s2] in list
     if(s1>= t1 && s2 <= t2)
     {
         saveIntervall()
     }
     else if (s1<= t2 && s2 > t2)
     {
         saveIntervall()
     }
     else if(s1 <  t1 && s2 >= t1)
     {
        saveIntervall()
     }
     else if (s1 < t1 && s2 > t2)
     {
        saveIntervall()
     }

Edit1: Yes I have ordered list.

Edit2: I have another problem, when i find the intersaction i need to calclulate a score between 0 and 1 of how the intersection is large. I use this:

            double score = 0.0d;
        if(s1>= t1 && s2 <= t2)
        {
            score = (s2-s1) / (t2-t1);
        }
        else if(t2 >= s2 && t1 > s1)
        {
            score = (s2-t1) / (t2-t1);
        }
        else if(t2 < s2 && t1 <= s1)
        {
            score = (t2-s1) / (t2-t1);
        }
        else
        {
            score = 1;
        }

I can speed up this too?

Upvotes: 8

Views: 7105

Answers (3)

dimo414
dimo414

Reputation: 48884

First and foremost, your data structure is confusing - if you're trying to talk about discrete intervals of time, structure your data like so; for instance int[][] where the inner array is always length 2, so your t1 becomes:

int[][] t1 = {{3,6}, {6,9}, {9,10}};

Using the right structure will probably help you simplify your algorithm and make it easier to work with.


Better than properly structured arrays, however, would be to use a dedicated type to represent these intervals, such that you could pass around List<Interval> objects and do some sort of contains check on them. But don't reinvent the wheel. The awesome Guava library provides a robust Range class that you can use. Even better though, it also provides RangeSet and RangeMap classes, which let you easily do the things you're talking about. See also their Ranges Explained article which covers the basics.

Note that you could pretty easily transform your current design into Range objects internally, if you can't redesign the array structure externally.

Having tried at one point to build my own IntervalSet class, let me tell you that it's a tricky problem to get right and you'll save yourself a lot of headaches using their well designed and highly tested range utilities.

Here's the way that I would do what you're describing with Guava - notice that we avoid even needing to think about the math involved - Range does the right thing for us:

/**
 * Given a Range and an group of other Ranges, identify the set of ranges in
 * the group which overlap with the first range.  Note this returns a Set<Range>
 * not a RangeSet, because we don't want to collapse connected ranges together. 
 */
public static <T extends Comparable<?>> Set<Range<T>>
        getIntersectingRanges(Range<T> intersects, Iterable<Range<T>> ranges) {
    ImmutableSet.Builder<Range<T>> builder = ImmutableSet.builder();
    for(Range<T> r : ranges) {
        if(r.isConnected(intersects) && !r.intersection(intersects).isEmpty()) {
            builder.add(r);
        }
    }
    return builder.build();
}

/**
 * Given a 2-length array representing a closed integer range, and an array of
 * discrete instances (each pair of which therefore represents a closed range)
 * return the set of ranges overlapping the first range.
 * Example: the instances array [1,2,3,4] maps to the ranges [1,2],[2,3],[3,4].
 */
public static Set<Range<Integer>> getIntersectingContinuousRanges(int[] intersects,
        int[] instances) {
    Preconditions.checkArgument(intersects.length == 2);
    Preconditions.checkArgument(instances.length >= 2);
    ImmutableList.Builder<Range<Integer>> builder = ImmutableList.builder();
    for(int i = 0; i < instances.length-1; i++) {
        builder.add(Range.closed(instances[i], instances[i+1]));
    }
    return getIntersectingRanges(Range.closed(intersects[0], intersects[1]),
                                 builder.build());
}

Using your examples:

public static void main(String[] args)
{
    int[] interval = {5,10};
    int[] t1 = {3,6,9,10};
    int[] t2 = {2,4,5,6,10};

    System.out.println(getIntersectingContinuousRanges(interval, t1));
    System.out.println(getIntersectingContinuousRanges(interval, t2));
}

The above prints out:

[[3‥6], [6‥9], [9‥10]]
[[4‥5], [5‥6], [6‥10]]

Upvotes: 3

Codor
Codor

Reputation: 17605

Given left boundary and length of two intervals, intersection can be testedby the following code.

protected boolean intervalOverlap( int pos1, int length1, int pos2, int length2 ){
  int pos_A_left  = pos1;
  int pos_A_right = pos1 + length1;
  int pos_B_left  = pos2;
  int pos_B_right = pos2 + length2;
  return pos_B_left < pos_A_right && pos_A_left < pos_B_right;
}

There is a short article in which this approach is discussed. Furthermore, an alternative representation of intervals (using center and length) is presented, for which intersection test can be implemented more efficiently.

Upvotes: 0

odedsh
odedsh

Reputation: 2624

The intersection of two intervals [s1, s2] and [t1, t2] is empty if and only if:

    t2 < s1 or s2 < t1

So for two intervals to check if the two are intersecting or not you need to do only the test above.

Also once you know that s2 < t1 then there is no point to continue further on the list that brought t1 since the larger intervals will never intersect, which means you should move on.

Naive Psuedo Algorithm:

   given [s1, s2]
   for each list [t1, t2, ... t(n)] in search_lists
        for each interval [t(x), t(x+1)] from [t1, t2, ... t(n] (x goes from 0 to n-1)
           if t(x+1) < s1
              continue
           if s2 < t(x)
              break
           saveInterval()

This can be improved quite a bit to really use the fact that [t1, t2, .. , t(n)] is sorted.

first note that [s1, s2] will intersect with [t(x), t(x+1)] iff t(x+1) >= s1 and s2 >= t(x)

However

if t(x) >= s1 then for every h>0      `t(x+h) >= s1` 

also

if s2 >= t(x) then for every h>0  `s2 >= t(x-h)`

so if we find the smallest i so that t(i+1)>=s1 then all the intervals from [t(i), t(i+1)] on-wards meet the first condition of intersection; i.e. ([t(i+1), t(i+2)] , [t(i+2), t(i+3)] ...)

and if we find the largest j so that s2 >= t(j-1) then all the intervals from [t(j-1), t(j)] backwards meet the second condition . i.e. ([t(j-2), t(j-1)], [t(j-3), t(j-2)] ...)

All the intervals between i and j meet both criteria and only them.

So the final algorithm is:

given [s1, s2]
for each list [t1, t2, ... t(n)] in search_lists
    find the smallest i such that t(i+1)>=s1  
    find the biggest  j such that s2>= t(j-1)

    if j>i then all the intervals between `{t(i)... t(j)}` intersect with [s1, s2]
    otherwise there is no intersection.       

Since {t1, t2, t3...t(n)} is sorted we can use binary search to find the indices i and j efficiently

EDIT2:

The intersection of [s1,s2] and [t1, t2] is:
[max(s1, t1), min(s2,t2)]

the sizes are: L1 = s2-s1 L2 = t2-t1 L3 = min(s2,t2) - max(s1,t1)

The score you are looking for is: L3/ min(L2, L1) a score between 0 and 1.

(min(s2,t2) - max(s1,t1)) / ( min(s2-s1, t2-t1) )

The cost of calculating this is 3 tests, 3 minus operations and one floating point operation. But I am assuming the intervals are valid and the intersection exists otherwise more tests are needed. (s2>s2, t2>t1 and min(s2,t2) > max(s1,t1). The final test is the same iff condition for intersection from the discussion above.

Upvotes: 5

Related Questions