LookIntoEast
LookIntoEast

Reputation: 8798

Maximum subsets of intervals that does not exceed coverage limit?

Here's one coding question I'm confused about.

Given a 2-D array [[1, 9], [2, 8], [2, 5], [3, 4], [6, 7], [6, 8]], each inner array represents an interval; and if we pile up these intervals, we'll see:

1 2 3 4 5 6 7 8 9
  2 3 4 5 6 7 8
  2 3 4 5 
    3 4   
          6 7
          6 7 8

Now there's a limit that the coverage should be <= 3 for each position; and obviously we could see for position 3, 4, 6, 7, the coverage is 4.

Then question is: maximally how many subsets of intervals can be chosen so that each interval could fit the <=3 limit? It's quite clear that for this case, we simply remove the longest interval [1, 9], so maximal subset number is 6 - 1 = 5.

What algorithm should I apply to such question? I guess it's variant question to interval scheduling?

Thanks

Upvotes: 4

Views: 187

Answers (2)

Said A. Sryheni
Said A. Sryheni

Reputation: 757

I think you can solve this problem using a sweep algorithm. Here's my approach:

The general idea is that instead of finding out the maximum number of intervals you can choose and still fit the limit, we will find the minimum number of intervals that must be deleted in order to make all the numbers fit the limit. Here's how we can do that:

First create a vector of triples, the first part is an integer, the second is a boolean, while the third part is an integer. The first part represents all the numbers from the input (both the start and end of intervals), the second part tells us whether the first part is the start or the end of an interval, while the third part represents the id of the interval.

Sort the created vector based on the first part, in case of a tie, the start should come before the end of some intervals.

In the example you provided the vector will be:

1,0 , 2,0 , 2,0 , 2,0 , 3,0 , 4,1 , 5,1 , 6.0 , 6.0 , 7,1 , 8,1 , 8,1 , 9,1

Now, iterate over the vector, while keeping a set of integers, which represents the intervals that are currently taken. The numbers inside the set represent the ends of the currently taken intervals. This set should be kept sorted in the increasing order.

While iterating over the vector, we might encounter one of the following 2 possibilities:

  1. We are currently handling the start of an interval. In this case we simply add the end of this interval (which is identified by the third part id) to the set. If the size of the set is more than the limit, we must surely delete exactly one interval, but which interval is the best for deleting? Of course it's the interval with the biggest end because deleting this interval will not only help you reduce the number of taken intervals to fit the limit, but it will also be most helpful in the future since it lasts the most. Simply delete this interval from the set (the corresponding end will be last in the set, since the set is sorted in increasing order of the end)

  2. We are currently handling the end of an interval, in this case check out the set. If it contains the specified end, just delete it, because the corresponding interval has come to its end. If the set doesn't contain an end that matches the one we are handling, simply just continue iterating to the next element, because this means we have already decided not to take the corresponding interval.

If you need to count the number of taken intervals, or even print them, it can be done easily. Whenever you handle the end of an interval, and you actually find this end at the set, this means that the corresponding interval is a taken one, and you may increment your answer by one, print it or keep it in some vector representing your answer.

The total complexity of my approach is : N Log(N), where N is the number of intervals given in the input.

Upvotes: 1

Koray
Koray

Reputation: 1796

I hope I have understood the question right. This is the solution I could able to get with C#:

                //test
                int[][] grid =  { new int[]{ 1, 9 }, new int[] { 2, 8 }, new int[] { 2, 5 }, new int[] { 3, 4 }, new int[] { 6, 7 }, new int[] { 6, 8 } };
                SubsetFinder sf = new SubsetFinder(grid);
                int t1 = sf.GetNumberOfIntervals(1);//6
                int t2 = sf.GetNumberOfIntervals(2);//5
                int t3 = sf.GetNumberOfIntervals(3);//5
                int t4 = sf.GetNumberOfIntervals(4);//2
                int t5 = sf.GetNumberOfIntervals(5);//0


        class SubsetFinder
        {
            Dictionary<int, List<int>> dic;
            int intervalCount;
            public SubsetFinder(int[][] grid)
            {
                init(grid);
            }
            private void init(int[][] grid)
            {
                this.dic = new Dictionary<int, List<int>>();
                this.intervalCount = grid.Length;
                for (int r = 0; r < grid.Length; r++)
                {
                    int[] row = grid[r];
                    if (row.Length != 2) throw new Exception("not grid");
                    int start = row[0];
                    int end = row[1];
                    if (end < start) throw new Exception("bad interval");
                    for (int i = start; i <= end; i++)
                        if (!dic.ContainsKey(i))
                            dic.Add(i, new List<int>(new int[] { r }));
                        else
                            dic[i].Add(r);
                }
            }
            public int GetNumberOfIntervals(int coverageLimit)
            {
                HashSet<int> hsExclude = new HashSet<int>();
                foreach (int key in dic.Keys)
                {
                    List<int> lst = dic[key];
                    if (lst.Count < coverageLimit)
                        foreach (int i in lst)
                            hsExclude.Add(i);
                }
                return intervalCount - hsExclude.Count;
            }
        }

Upvotes: 1

Related Questions