n00bc0d3r
n00bc0d3r

Reputation: 275

Maximum intersecting intervals

I have the following task hand. I have been given N rules composed of 3 intervals each.

A1 B1 C1 D1 E1 F1
A2 B2 C2 D2 E2 F2
A3 B3 C3 D3 E3 F3
.
.
.

An Bn Cn Dn En Fn

Each rule is comprised of three different intervals, [a,b] [c,d] [e,f].

Let us consider a input comprising of three values x, y and z. This input will satisfy a rule if a<=x<=b and c<=y<=d and e<=z<=f. The task is to find the count of maximum number of rules which can be satisfied through any arbitrary input.

I tried to solve the problem using brute force technique. It is O(N^2) solution. I check each rule with every other rule to find if they intersect with each other. I maintain a list where I keep record of the rule numbers which intersect with the current rule. I output the maximum size using this record. But I am getting wrong answer using this approach. Please help me by providing a hint about what I might be doing wrong.

Example:

Suppose that we are give 5 rules:

Rule 1: 39 40 43 55 28 42
Rule 2: 6 15 36 43 12 56
Rule 3: 38 57 3 15 17 36
Rule 4: 3 15 36 60 21 52
Rule 5: 24 45 23 34 27 39

Here, the we can compute that only Rule 2 and Rule 4 can be simultaneously satisfied through any input. Therefore, answer = 2.

Upvotes: 1

Views: 226

Answers (4)

Rafael Baptista
Rafael Baptista

Reputation: 11499

I guess I misunderstood the question. In that case, you can solve this via binary space partitioning.

http://en.wikipedia.org/wiki/Binary_space_partitioning

It helps to imagine each of your rules as defining a cubic space. ( which I bet they do ) Chose a rule at random and then a parameter. Use that parameter to divide the space. All rules completely contained on one side of the partition go into one group, all rules on the other side into another group. If a rule is on both sides, you put it in both groups, but adjusts its parameters to be on just one side.

For example, rules:

5,8,x,x,x,x
1,3,x,x,x,x
7,10,x,x,x,x
6,9,x,x,x,x

You chose rule 3, parameter A, with value 6. Bisect the rule space:

1,3,x,x,x,x
5,6,x,x,x,x

7,10,x,x,x,x
6,9,x,x,x,x
6,8,x,x,x,x

So rule 0 got divided in half, and put into both spaces. Do this process recursively for each group until all the rules in the group are identical. In each recursion chose the parameters in the following order A C E B D F. That is you divide the space along the first constraint, then the next division in that space is on the second constraint etc.

I'm not sure if it would be more efficient to BSP breadth first or depth first. When you find a space that resolves with a higher rule count than any previous space, and the you can stop recursing down any paths that have that many rules or fewer. ( .e.g If I know of one subspace that has 5 rules, if I have an unresolved space with 5 rules I don't have to do down any further)

When you have partitioned all the spaces until they each have identical rules, the partition with the most rules is your max count.

Upvotes: 0

Aivean
Aivean

Reputation: 10882

You approach is invalid, because when Rule1 intersects with Rule2 and Rule1 intersects with Rule3 this doesn't mean that Rule2 intersects with Rule3.

Example:

 Rule1:   [1,2] [3,4] [5,6]
 Rule2:   [1,1] [3,3] [5,5]
 Rule3:   [2,2] [4,4] [6,6]

In your solution when you fix Rule1 in outer loop then find Rule2 and Rule3 in nested loop and get the result of 3, while correct answer is 2.

I have a working solution in mind, but it's not very efficient: O(K3 * N) where K comes from the domain of definition of rules' intervals (K = max_interval_value - min_interval_value) and N is the number of rules. Also it consumes O(K * N) memory. If this setup fits problem's time and memory limits, I can describe my solution, if not, I'll rather will not waste my time.

Update:

Ok, here we go. For simplicity I will assume that minimum number that appears in rules is 0 and the maximum is K. So there are only K+1 possible integer numbers that can satisfy at least one rules' interval.

As each rule has exactly three intervals, you have to create three arrays that will correspond to these intervals. Array's element at index i will contain the set of rules that include number i in their corresponding interval. Each array has length of K+1.

For example, if we have rules:

 A: [1,2][3,4][5,6]
 B: [1,1][2,2][6,6]
 C: [2,2][4,4][5,5]

Then first array (that corresponds to intervals at position 1) will be:

 0:{}, 1:{A,B}, 2:{A,C}, 3:{}.... rest are empty sets

Second array corresponds to intervals at second position:

 0:{}, 1:{}, 2:{B}, 3:{A}, 4:{A, C}, 5:{}, 6:{}

Third array will be:

 ...all empty... 5: {A, C}, 6: {A, B}

Now how to fill these arrays. You start with empty sets in each element. Then you go through all rules. For each rule you iterate through all values that lie in corresponding interval and add current rule into appropriate set. For example, for rule A for first interval you need to iterate through values 1,2 and add A into first array at indexes 1,2. This part takes 3 * K * N steps (assuming adding element to set is O(1))

What you do next is basically trying to figure an input that will satisfy most rules. You need three nested loops, say, i for first interval, j for second interval, k for third. You need to iterate through all possible values from 0 to K. And for each combination of i,j,k you need to intersect three sets, each taken from corresponding array. So, returning to the example, for i=2, j=4, k=5 you will need to intersect {A, C} with {A, C} with {A, C}, and this will result actually largest set of rules. For i=1, j=3, k=5 you will intersect {A,B} with {A} and {A,C} and get only {A}. So, you iterate, calculating intersection and finding maximal set of rules. The complexity of this part is O( K3 * N) where K3 comes from nested loops and N is the cost of set intersection.

That's all. I haven't thought much on this solution, so maybe it can be further optimized. But at least it is polynomial.

Update

Now how to eliminate one nested loop.

When you have some i and j and already computed intersection of groups from first two arrays, you get virtually new task. Now you have set of groups and you need to find maximal subset of this set that will satisfy only one remained interval (third). What I propose is to iterate all possible values from 0 to K and for each value maintain the set of groups that are satisfied by that value. In order to do that you have to sort initial subset of groups in two ways, first by first interval's boundary, second by second interval's boundary. Lets call first sorted array add_order, second remove_order. (Later you see that these arrays can be replaced by queues for simplicity).

If you have three groups in you initial subset:

 A: [..][..][2,4]
 B: [..][..][2,2]
 C: [..][..][1,3]

They will be ordered as this: C, A, B for add_order.

 C: [..][..][1,3]
 A: [..][..][2,4]
 B: [..][..][2,2]

And B, C, A for remove_order:

 B: [..][..][2,2]
 C: [..][..][1,3]
 A: [..][..][2,4]

You need three pointers in your loop. First (g_add) will point (in array add_order) at the group you are about to "add" to your maintained set, second (g_remove) will point (in array remove_order) at the group you are about to remove from your set. And the last, k will hold current value (from 0 to K).

Returning to example, you will iterate k from 0 to 5 (g_add and g_remove initially pointing on first group of corresponding array):

 k=0, g_add=C, g_remove=B, res={} // nothing to do
 k=1, g_add=C, g_remove=B, res={} // `g_add` is pointing on the group whose first boundary is less or equal to `k`, adding this group to set of current groups, and incrementing `g_add`
 k=1, g_add=A, g_remove=B, res={C} // moving on, incrementing `k`
 k=2, g_add=A, g_remove=B, res={C} // now A satisfies adding condition
 k=2, g_add=B, g_remove=B, res={C, A} // B satisfies adding condition too
 k=2, g_add=, g_remove=B, res={C, A, B} // incrementing k (note that at that point we have largest set)
 k=3, g_add=, g_remove=B, res={C, A, B} // B has second boundary less than k, removing
 k=3, g_add=, g_remove=C, res={C, A} // B has second boundary less than k, removing
 k=3, g_add=, g_remove=C, res={C, A} // k++
 k=4, g_add=, g_remove=C, res={C, A} // C is suitable for removing
 k=4, g_add=, g_remove=A, res={A} // k++
 k=5, g_add=, g_remove=A, res={A} // A is suitable for removing
 k=5, g_add=, g_remove=, res={} // we are done

As I said before, you can consider remove_order and add_order as a queues and their heads will be add and remove candidates correspondingly.
Now about complexity. Iterate all pairs of i and j takes O(K2). Now for each such pair you do sets intersection O(N), you don't need to sort the groups, as this can be done at very beginning, before the loops. Then you need to iterate k, adding and removing groups. As each group is processed only twice (when adding and removing) and k is iterated from 0 to K the complexity of this loop will be O(N + K).
So for the whole algorithm we get O(K2 * (N + K)) that, considering K << N roughly will be O(K2 * N).

Upvotes: 1

Rafael Baptista
Rafael Baptista

Reputation: 11499

I'm not completely sure about what you are asking. I'm assuming you want an algorithm that given an arbitrary input x,y,z, will output the count of rules that match. It is just as easy to output the list of rules themselves.

Transform your rule set into a pair of arrays for each value A,B,C,D,E,F. So for the values of A the arrays are A_val, and A_rule. In A_val you put all the values of A, and in A_rule you put all the rule indexes corresponding to those values.

E.g. if you rules were

1,x,x,x,x,x
5,x,x,x,x,x
3,x,x,x,x,x

then the arrays would be:

A_val = [ 1,5,3 ];
A_rule = [ 0, 1, 2 ];

Now sort the two arrays using A_val as the sort key:

A_val = [ 1,3,5 ];
A_rule = [ 0, 2, 1 ];

You do this for all 6 values, so you will have arrays: A_val, B_val, C_val etc. and A_rule, B_rule, C_rule etc.

Now for any input x,y,z,

  1. do a binary search into A_val looking for x. From that index to the end, A_rule will contain the list of rules where A satisfies x.

  2. do a binary search into B_val looking for x. From that index to the beginning of the B_rule array are all the rules where B satisfies x.

Do this for each of the 6 intervals. You will have 6 sets of rule ids that satisfy each of the 6 conditions. Now just find the intersection of those 6 sets. There are a bunch of set intersection algorithms.

Look here: Efficient list intersection algorithm

An easy one would be to put all the rule ids into a single array and sort that array. Any rule id that occurs in the sorted array 6 times in a row is a rule that satisfies all 6 conditions.

Upvotes: 0

gTcV
gTcV

Reputation: 2504

Intersecting is not a transitive relation: If you have A = [0,2], B = [1,3], C = [2,4], (A,B) and (B,C) intersect, but not (A,C). So the answer is that a point can lie in at most one intervals, even though B intersects with two.

A proper brute-force solution would be to compute the intersection of all subsets of your rules and pick the largest subset with non-zero intersection. But that would scale as O(2^N), then.

Btw, the problem is known as the maximum interval stabbing problem (or some variant thereof). You can find reference for it on Google.

Upvotes: 0

Related Questions