Konstantin
Konstantin

Reputation: 3123

Finding the longest growing sequence in an array of arrays with Ruby

I have an array of arrays which contains integers. For example:

arr=[[109, 160, 184, 229],
[45, 67, 158, 175, 201, 250, 273],
[33, 86, 89, 182, 245, 251, 254, 272],
[35, 76, 93, 143, 222, 267],
[189, 242],
[19],
[41, 58, 135, 256],
[59],
[60, 138, 183, 203, 246],
[45, 67, 158, 175, 197, 201, 250, 273],
[55, 57, 101, 103, 193, 212, 231, 257],
[18, 23, 51, 75, 106, 139, 179, 247],
[31, 72, 92, 99, 148, 230],
[128, 142, 151, 164, 170, 173, 196, 226],
[15],
[4],
[41, 113, 135, 256],
[33, 251]]

Look at the whole example array here http://pastebin.com/exzi8Mnq

Every nested array is sorted, and contains uniqe elements, no duplicates, they contains at least one element, at most there are no limit (usually maximum 10-15 elements) and 200-2000 nested arrays in total.

I would like to find the longest sequence of the integers across the arrays which elements monotonously grows, without equality. For example one element from the first array, the second is from the second array, etc. Every nested array should provide only one element or zero. Maybe the first element doesn't origin from the first array, but the second, or third, or more, to ensure finally the longest growing sequence. In top of that some of the nested arrays could be skipped, I mean they don't provide any element to the result sequence.

You can imagine a looong number in a numeral system which radix is equal to the size of the largest nested array and the digits may refer to the the nested arrays, so we can count all the possible sequences. For example 0123 would represent that sequence that first element is chosen from the second array, and its index is 0 (1-1). Second element is chosen from the 3rd array, and its index is 2 (3-1) First digit 0 represents that no integer is chosen from the first array. The allowed maximal value of a digit is limited by the size of the given nested array.

Example output: [45, 86, 93, 189] First element was chosen from the second array, second from the third array, etc. However this is obviously not the longest sequence which can be extracted.

I think some kind of backtracking or creating the product of all array to an enumerator and examine the result.

I need this method for my subtitle timing program.

Upvotes: 1

Views: 1395

Answers (3)

Cary Swoveland
Cary Swoveland

Reputation: 110735

Approach

We can solve this problem using dynamic programming, where:

  • the stages are the subarrays;
  • the state variable is the largest element of a sequence;
  • the function to be maximized, at each stage i, 0 <= i < arr.size, for each n, 0 <= n <= largest (where largest is the largest value among all subarrays: arr.flatten.max), is to maximimize the length of the longest sequence taken from arr[0],...arr[i], such that the largest (last) value of the sequence is n.

Unfamiliar with "dynamic programming"? If so, not a problem, as here the optimization technique is straightforward and intuitive.

We will examine each subarray arr[j], j=0..arr.size-1 in turn. For each such j and each n, 0 <= n <= largest (largest = arr.flatten.max), we define the array longest[n] to be the longest sequence among all sequences drawn from subarrays arr[0], arr[1],...arr[j] whose largest (last) value is at most n. The optimal solution is given by longest[largest] after all subarrays have been examined. Yes, that's a mouthful, but if you want to understand the algorithm I'm using, you need to understand what I've just said.

Code

def find_longest_sequence(arr)
  largest = arr.map(&:last).max
  longest = Array.new(largest+1,[])
  arr.each do |a|
    new_longest = a.each_with_object([]) { |n,new_longest|
      # See if, by using n from a, we have found a largest sequence ending in n
      (new_longest << [n,longest[n-1]+[n]]) if
        longest[n].size <= longest[n-1].size }
    # Update longest for each pair in new_longest, starting from largest n
    until new_longest.empty?
      i,seq = new_longest.pop
      len = seq.size
      (i..largest).each { |j|
        (len > longest[j].size) ? (longest[j] = seq) : break }
    end
  end
  longest[-1]
end

Examples

arr = [[3, 4], [2, 5], [1, 4], [4, 5]]
find_longest_sequence(arr) #=> [3, 4, 5]

For your array arr above:

find_longest_sequence(arr)
  #=> [33, 35, 58, 59, 60, 67, 103, 139, 148, 226, 256]

For your array arr at the link you give:

find_longest_sequence(arr)
  #=> [ 33,  35,  58,  59,  60,  67,  75,  92,  97, 104, 105,
  #    108, 109, 115, 116, 125, 127, 132, 135, 149, 163, 170,
  #    173, 174, 176, 177, 178, 183, 184, 187, 198, 199, 200,
  #    218, 222, 230, 234, 241, 247, 259, 265, 266, 267, 269]
find_longest_sequence(arr).size #=> 44

Explanation

Perhaps the best way to explain how the algorithm works is to run it with some debugging statements.

def find_longest_sequence(arr)
  largest = arr.map(&:last).max
  longest = Array.new(largest+1,[])
  puts "largest = #{largest}, longest.size = #{longest.size}"
  arr.each do |a|
    new_longest = a.each_with_object([]) { |n,new_longest|
      # See if, by using n from a, we have found a largest sequence ending in n
      (new_longest << [n,longest[n-1]+[n]]) if
        longest[n].size <= longest[n-1].size }
      puts "  new_longest = #{new_longest}"
    # Update longest for each pair in new_longest, starting from largest n
    until new_longest.empty?
      i,seq = new_longest.pop
      len = seq.size
      puts "    i=#{i}, seq=#{seq}, len =#{len}, new_longest=#{new_longest}"
      (i..largest).each { |j| (len > longest[j].size) ?
        (puts "j=#{j}"; longest[j] = seq) : break }
      puts "    longest=#{longest}"
    end
  end
  longest[-1]
end

arr = [[3, 4], [2, 5], [1, 4], [4, 5]]
find_longest_sequence(arr) #=> [3, 4, 5]

[[3, 4], [2, 5], [1, 4], [4, 5]]
largest = 5, longest.size = 6
  new_longest = [[3, [3]], [4, [4]]]
    i=4, seq=[4], len =1, new_longest=[[3, [3]]]
j=4
j=5
    longest=[[], [], [], [], [4], [4]]
    i=3, seq=[3], len =1, new_longest=[]
j=3
    longest=[[], [], [], [3], [4], [4]]
  new_longest = [[2, [2]], [5, [4, 5]]]
    i=5, seq=[4, 5], len =2, new_longest=[[2, [2]]]
j=5
    longest=[[], [], [], [3], [4], [4, 5]]
    i=2, seq=[2], len =1, new_longest=[]
j=2
    longest=[[], [], [2], [3], [4], [4, 5]]
  new_longest = [[1, [1]], [4, [3, 4]]]
    i=4, seq=[3, 4], len =2, new_longest=[[1, [1]]]
j=4
    longest=[[], [], [2], [3], [3, 4], [4, 5]]
    i=1, seq=[1], len =1, new_longest=[]
j=1
    longest=[[], [1], [2], [3], [3, 4], [4, 5]]
  new_longest = [[5, [3, 4, 5]]]
    i=5, seq=[3, 4, 5], len =3, new_longest=[]
j=5
    longest=[[], [1], [2], [3], [3, 4], [3, 4, 5]]
[3, 4, 5]

Upvotes: 1

Henry Chinner
Henry Chinner

Reputation: 429

Not that easy, in some cases an array can be skipped to give better results. Here is a naive implementation anyway that greedily takes the next available bigger number.

[Edit]: The Algorithm now runs through all possible permutations of skipping arrays when selecting the next bigger number in the sequence. The algo is now guaranteed to return an optimal solution.

require 'pp'

@arr=[[109, 160, 184, 229],[45, 67, 158, 175, 201, 250, 273],[33, 86, 89, 182, 245, 251, 254, 272],[35, 76, 93, 143, 222, 267],[189, 242],[19],[41, 58, 135, 256],[59],[60, 138, 183, 203, 246],[45, 67, 158, 175, 197, 201, 250, 273],[55, 57, 101, 103, 193, 212, 231, 257],[18, 23, 51, 75, 106, 139, 179, 247],[31, 72, 92, 99, 148, 230],[128, 142, 151, 164, 170, 173, 196, 226],[15],[4],[41, 113, 135, 256],[33, 251]]

@result_set =[]
reverse_ary = []

@perms = [1, 0].repeated_permutation(@arr.size).to_a

def iterate_large_ary(perm)
    seq = []
    @arr.each_with_index do |a,i| 
        if perm[i] != 0
            if seq.length == 0
                seq << a[0]
            else
                temp = find_next_bigger_number(a,seq[-1])
                if temp != nil
                 seq << temp
                end
            end
        else
            next
        end         
    end 
    @result_set << [seq,seq.length]
end 

def find_next_bigger_number(ary,num)
    temp = nil
    ary.each do |el|

        if el > num
            temp = el
            break
        end 
    end
    return temp
end 

@perms.each{|perm| iterate_large_ary(perm)}

pp @result_set.sort!{|a,b| b[1] <=> a[1]}[0][0]

Upvotes: 0

konsolebox
konsolebox

Reputation: 75578

And so this was fun.

#!/usr/bin/env ruby

a = [ [109, 160, 184, 229], [45, 67, 158, 175, 201, 250, 273], [33, 86, 89, 182, 245, 251, 254, 272], [35, 76, 93, 143, 222, 267], [189, 242], [19], [41, 58, 135, 256], [59], [60, 138, 183, 203, 246], [45, 67, 158, 175, 197, 201, 250, 273], [55, 57, 101, 103, 193, 212, 231, 257], [18, 23, 51, 75, 106, 139, 179, 247], [31, 72, 92, 99, 148, 230], [128, 142, 151, 164, 170, 173, 196, 226], [15], [4], [41, 113, 135, 256], [33, 251], [25, 69, 84, 97, 133, 171, 204, 248, 252, 258, 268, 269], [25, 69, 96, 133, 171, 176, 194, 204, 252, 258, 268], [17, 29, 53, 61, 102, 104, 123, 127, 129, 145, 146, 178, 188, 233, 265], [6, 13, 39, 71, 98, 105, 185, 234, 235], [86, 89, 181, 182, 245, 254, 272], [50, 108, 110, 222], [55, 101, 103, 169, 191, 193, 205, 212, 231], [56, 83, 134, 138, 246], [109, 160, 184, 229], [208], [60, 138, 183, 203, 246], [45, 67, 158, 175, 201, 250, 273], [266], [161], [100, 228], [38, 82, 115, 180, 255, 260], [116], [13, 57, 98, 105, 185, 235, 257], [113, 135], [30, 131, 202, 241, 271], [266], [46, 52, 198, 209, 232], [125, 130, 154], [161], [92, 99, 148, 230], [12, 37, 49, 80, 94, 151, 164, 226], [26, 102, 126, 127, 145, 236, 261], [20, 37, 80, 132, 219, 259], [95, 227], [113, 135], [38, 82, 115, 180, 255, 260], [149], [263], [163], [17, 53, 61, 104, 128, 129, 142, 170, 173, 196, 233, 265], [120, 220], [31, 72, 92, 99, 148, 230], [56, 83, 134, 138, 246], [4, 109], [128, 142, 151, 164, 170, 173, 196, 226], [46, 52, 174, 199, 209], [120, 220], [25, 69, 96, 133, 171, 176, 194, 204, 252, 258, 268], [177], [17, 29, 53, 61, 104, 123, 128, 129, 142, 146, 178, 188, 196, 233, 265], [47, 107], [60, 161, 183, 203], [56, 83, 134, 138, 246], [100, 109, 160, 184, 228, 229], [174, 199], [187], [3, 11, 62, 153, 165, 216], [18, 20, 139, 247, 259], [9, 21, 74, 157], [54, 85, 210, 211], [25, 69, 84, 97, 133, 171, 204, 248, 252, 258, 268, 269], [41, 58, 135, 256], [54, 85, 210], [198, 232], [46, 52, 174, 199, 209], [48, 119, 200], [45, 120, 197, 250, 273], [1], [198, 232], [47, 107], [35, 76, 93, 143, 222, 267], [218], [13, 55, 57, 98, 101, 193, 212, 231, 235, 257], [4, 109], [41, 58, 256], [31, 92, 99, 148, 230], [35, 76, 93, 143, 222, 267], [35, 93, 124], [31, 72, 92, 99, 148, 230], [243], [12, 49, 94, 151, 164, 170, 173, 226], [18, 23, 51, 75, 106, 139, 179, 247], [6, 39, 71, 75, 105, 106, 185, 234], [30, 131, 202, 241, 271], [18, 20, 139, 247, 259], [20, 37, 80, 132, 219, 259], [35, 76, 93, 143, 222, 267], [31, 72, 92, 99, 148, 230], [30, 131, 202, 241, 271], [46, 52, 199, 209], [46, 52, 198, 209, 232], [17, 53, 61, 104, 128, 129, 142, 170, 173, 196, 233, 265], [125, 266], [31, 72, 84, 97, 248, 269], [227], [33, 86, 182, 245, 251, 254], [35, 93, 124], [76, 108, 143, 222, 267], [17, 53, 61, 104, 128, 129, 142, 170, 173, 196, 233, 265], [31, 72, 84, 97, 248, 269] ]

def find_longest_seq(a, l = [[]], i = nil, s = nil)
  s ||= a.size - 1
  if (set = a[s])
    if set.size > 0
      set.sort!
      recursed = false
      while (e = set.last)
        if i.nil? or e < i
          l, r = find_longest_seq(a, l, e, s - 1)
          r = r ? r + [e] : [e]
          if r.size == l.first.size
            l << r
          elsif r.size > l.first.size
            l = [r]
          end
          return [l, r]
        elsif not recursed
          l, r = find_longest_seq(a, l, e, s - 1)
          r = r ? r + [e] : [e]
          if r.size == l.first.size
            l << r
          elsif r.size > l.first.size
            l = [r]
          end
          recursed = true
        end
        set.pop
      end
    end
  end
  [l, nil]
end

l, r = find_longest_seq(a.map(&:clone))
l.each { |e| puts e.inspect }

.sort! is optional if data is already sorted.

Output:

[20, 132, 143, 148, 202, 209, 232, 265, 266, 269]
[18, 80, 93, 99, 131, 209, 232, 265, 266, 269]

Upvotes: 1

Related Questions