punitcse
punitcse

Reputation: 727

How can I get the count of overlapping ranges in sorted order?

Suppose I have an array of sorted inclusive ranges:

a = [1012..1014, 1016..1020, 1017..1022, 1021..1035, 1040..1080]

I want as output an array of arrays, each of whose first element is a range and second element its overlapping count, like this:

[[1012..1014, 1], [1016..1016, 1], [1017..1020, 2], [1021..1022, 2], [1023..1035, 1], [1040..1080, 1]]

For example, the range 1017..1020 is included in two ranges 1016..1020 and 1017..1022, so its count would be two.

Upvotes: 5

Views: 291

Answers (5)

Cary Swoveland
Cary Swoveland

Reputation: 110675

Code

require 'set'

def range_info(a)
  covered_by = a.each_with_object(Hash.new { |h,k| h[k]=Set.new }) { |r,h|
    r.each { |n| h[n] << r } }
  a.flat_map { |r| r.to_a }.
    uniq.
    slice_when { |b,c| c > b+1 }.
    flat_map { |r| r.to_a.slice_when { |b,c| covered_by[b] != covered_by[c] } }.
    flat_map { |enum| enum.to_a.map { |a| [a.first..a.last, covered_by[a.first].size] } }
end

Example

a = [1012..1014, 1016..1020, 1017..1022, 1021..1035, 1040..1080]

range_info(a)
  #=> [[1012..1014, 1], [1016..1016, 1], [1017..1020, 2], [1021..1022, 2],
  #    [1023..1035, 1], [1040..1080, 1]] 

Explanation

First create the hash covered_by with keys equal to numbers that are covered by at least one range in a, where covered_by[n] equals the set of all ranges in a that cover key n:

covered_by = a.each_with_object(Hash.new { |h,k| h[k]=Set.new }) { |r,h|
  r.each { |n| h[n] << r } }
  #=> {1012=>#<Set: {1012..1014}>, 1013=>#<Set: {1012..1014}>,
  #    ...
  #    1016=>#<Set: {1016..1020}>, 1017=>#<Set: {1016..1020, 1017..1022}>,
  #    ...
  #    1079=>#<Set: {1040..1080}>, 1080=>#<Set: {1040..1080}>} 

See my answer here for an explanation of Hash.new { |h,k| h[k]=[] }, which is similar to Hash.new { |h,k| h[k]=Set.new }.

Next, obtain an array of increasing non-overlapping ranges that cover the same numbers that are covered by one or more ranges in a:

arr = a.flat_map { |r| r.to_a }.uniq.slice_when { |b,c| c > b+1 }
  #=> [1012..1014, 1016..1035, 1040..1080] 

Next, break each of the ranges in arr into enumerators that will generate arrays of consecutive numbers that are covered by the same ranges in a:

b = arr.flat_map { |r| r.to_a.slice_when { |b,c| covered_by[b] != covered_by[c] } }
  #=> [#<Enumerator: #<Enumerator::Generator:0x007fd1ea854558>:each>,
  #    #<Enumerator: #<Enumerator::Generator:0x007fd1ea8543c8>:each>,
  #    #<Enumerator: #<Enumerator::Generator:0x007fd1ea854238>:each>] 

We can see the elements of b by converting them to arrays:

b.map(&:to_a)
  #=> [[[1012, 1013, 1014]],
  #    [[1016], [1017, 1018, 1019, 1020], [1021, 1022],
  #     [1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033,
  #      1034, 1035]],
  #    [[1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050,
  #      1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061,
  #      1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072,
  #      1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080]]] 

Lastly, flat_map these arrays to arrays containing a range and the number of ranges in a that cover all the elements of the range:

c = b.flat_map { |enum| enum.to_a.map { |a| [a.first..a.last, covered_by[a.first].size] } }
  #=> [[1012..1014, 1], [1016..1016, 1], [1017..1020, 2], [1021..1022, 2],
  #    [1023..1035, 1], [1040..1080, 1]] 

Upvotes: 2

sawa
sawa

Reputation: 168091

The following solution works under limited occasions: when the minimum value of a range and the maximum value of a range are never identical. (I.e., if there is x..100, then there is no 100..y. Also there is no z..z.)

break_points = a.flat_map{|r| [r.min - 1, r.min, r.max, r.max + 1]}.uniq.sort
a.flat_map do
  |r|
  break_points
  .select{|i| r.min <= i and i <= r.max}
  .each_slice(2)
  .map{|min, max| min..max}
end
.group_by(&:itself)
.map{|k, v| [k, v.length]}

Upvotes: 0

Van Huy
Van Huy

Reputation: 1627

This is my approach to solve your problem. Let

a = [1012..1014, 1016..1020, 1017..1022, 1021..1035, 1040..1080]

Step 1: Flatten this array, then count each element

b = a.map(&:to_a).inject(:+).sort.group_by{|i| i }.map{|k,v| [k,v.count] }

# => [[1012, 1], [1013, 1], [1014, 1], [1016, 1], [1017, 2], [1018, 2], [1019, 2], [1020, 2], [1021, 2], [1022, 2], [1023, 1], ...

Step 2: Add nil as break points

c = b.each_with_index do |e, i| 
  if e.nil? || b[i+1].nil? then next end
  if b[i][0] + 1  != b[i+1][0] || b[i][1] != b[i+1][1] then b.insert(i+1,nil) end
end 

# => [[1012, 1], [1013, 1], [1014, 1], nil, [1016, 1], nil, [1017, 2], [1018, 2], [1019, 2], [1020, 2], [1021, 2], [1022, 2], nil, [1023, 1], ...

Step 3: Split obtained array by break points and group them to ranges

d = c.split{|e| e.nil?}.map{|e| [(e.first[0]..e.last[0]), e.first[1]]}

# => [[1012..1014, 1], [1016..1016, 1], [1017..1022, 2], [1023..1035, 1], [1040..1080, 1]]

Update:

Since split is a method from Rails, so I have an alternative with pure Ruby.

Step 1: same as above

Step 2: Split the array into small groups as below

c = []
j = 0
b.each_with_index do |e, i| 
  if c[j].nil? then c[j] =[] end
  c[j] << b[i]
  if b[i+1] && (b[i][0] + 1 != b[i+1][0] || b[i][1] != b[i+1][1]) then j+=1 end
end

# p c => [
#          [[[1012, 1], [1013, 1], [1014, 1]], 
#          [[1016, 1]], 
#          [[1017, 2], [1018, 2], [1019, 2], [1020, 2], [1021, 2], [1022, 2]], 
#           ...
#        ]

Step 3: Convert each group to a range

d = c.map{|e| [(e.first[0]..e.last[0]), e.first[1]]}

# => [[1012..1014, 1], [1016..1016, 1], [1017..1022, 2], [1023..1035, 1], [1040..1080, 1]]

Upvotes: 0

theo
theo

Reputation: 46

If you are looking to test all subranges from the supplied ranges, you could try something like this (only accounts for subranges starting from the min value of each original range):

a = [1012..1014, 1016..1020, 1017..1022, 1021..1035, 1040..1080]
test_inputs = a.each_with_object([]) do |original, expanded| 
  original.size.times.each{ |i| expanded << Range.new(original.min, original.min+i) }
end

output = test_inputs.each_with_object([]) do |input, result|
  appears = a.select{|x| x.min <= input.min}.select{|x| x.max >= input.max}.count
  result << [input, appears]
end

Upvotes: 0

Wand Maker
Wand Maker

Reputation: 18762

Here is my take on this problem. It may not be efficient - complexity O(n2) - nonetheless, it is a solution.

My approach to find whether a range is sub-range of another range is to do following steps:

  1. Convert both ranges to array and join then using Array#| operator
  2. Sort the array obtained by combining the two ranges.
  3. If one range is sub-range of another, then, the range that includes the sub-range will be equal to combined sorted array when converted to array using to_a.

Here is an illustration:

r1 = 2..3
r2 = 1..4

p a = r1.to_a | r2.to_a
#=> [2, 3, 1, 4]
p a = a.sort
#=> [1, 2, 3, 4]
p a == r1.to_a
#=> [1,2,3,4] == [2,3] 
#=> false
p a == r2.to_a
#=> [1,2,3,4] == [1,2,3,4]
#=> true

Based on the above approach, here is the complete code. Although I am not sure that example list of ranges given in the question has any overlapping ranges, hence, I have taken example of my own.

h = {}
r_a = [1016..1020, 1017..1020, 1021..1035, 1040..1080]
r_a.each {|r| h[r] = 1}

(0...r_a.length).each do |i|
    (0...r_a.length).each do |j|
        if (i != j)
            range_outer = r_a[i]
            range_inner = r_a[j]

            first,*rest,last = (range_outer.to_a | range_inner.to_a).to_a.sort
            combined_range = Range.new(first, last)

            if range_inner == combined_range
                h[range_outer] += 1 
            end
        end
    end
end
p h
#=> {1016..1020=>1, 1017..1020=>2, 1021..1035=>1, 1040..1080=>1}

Upvotes: 1

Related Questions