Shelvacu

Reputation: 4380

How to auto-save the state of a long iteration in Ruby

Say I have a program with a runtime in the order of weeks with a structure like this:

(1..1000).each do |number|
  ('a'..'z').each do |letter|
    %w(alpha beta omega whatever foo bar).each do |word|
      do_long_running_calculation(number,letter,word)
    end
  end
end

Since the machine running the program may have a sudden unexpected halt, I'd like to save the index that it was on for each array to a file, such that it can re-start from where it left off instead of starting from the beginning in case of a sudden program abort.

If this doesn't already exist as a library (or as an easy solution that has evaded me), I'll write it myself and post it as an answer, but I would like to avoid re-inventing the wheel.

Upvotes: 2

Views: 209

Answers (2)

Cary Swoveland

Reputation: 110755

Here's a way to deal with your specific example, which could be generalized.

First, two helpers:

class Range
  def size # overrides the built-in Range#size
    case first
    when Integer # Fixnum was merged into Integer in Ruby 2.4
      last - first + 1
    when String
      last.ord - first.ord + 1
    end
  end

  def [](offset)
    case first
    when Integer
      first + offset
    when String
      (first.ord + offset).chr
    end
  end
end

For example:

(1..10).size    #=> 10 
('a'..'z').size #=> 26 
(1..10)[4]      #=> 5 
('a'..'z')[4]   #=> "e" 

For the loops in the question:

loops = [(1..1000), ('a'..'z'), %w(alpha beta omega whatever foo bar)]
loop_sizes = loops.map(&:size)
  #=> [1000, 26, 6]
prod = 1
tot_nbr_loops, *prod_loop_sizes = (loop_sizes + [1]).reverse.
  map { |n| prod = n*prod }.reverse
  #=> [156000, 156, 6, 1] 
tot_nbr_loops
  #=> 156000 
prod_loop_sizes
  #=> [156, 6, 1] 

Using these objects we can create a method that maps a sequence of integers into the triples that are to be enumerated:

def elements(loops, prod_loop_sizes, offset)
  loops.zip(prod_loop_sizes).map do |loop, prod|
    div, offset = offset.divmod(prod)
    loop[div]
  end
end

Let's try it:

elements(loops, prod_loop_sizes, 0)      #=> [1, "a", "alpha"] 
elements(loops, prod_loop_sizes, 5)      #=> [1, "a", "bar"] 
elements(loops, prod_loop_sizes, 6)      #=> [1, "b", "alpha"] 
elements(loops, prod_loop_sizes, 155)    #=> [1, "z", "bar"] 
elements(loops, prod_loop_sizes, 156)    #=> [2, "a", "alpha"] 
elements(loops, prod_loop_sizes, 155999) #=> [1000, "z", "bar"] 
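
If you would rather not reopen Range, the same offset-to-triple mapping can be sketched with plain arrays. This is only an illustration: the names strides_for and tuple_at are made up here, and it assumes each loop is small enough to materialize with to_a.

```ruby
# Each loop is turned into an array so that no core class has to be reopened.
LOOPS = [(1..1000).to_a, ('a'..'z').to_a, %w(alpha beta omega whatever foo bar)]

# Hypothetical helper: strides[i] is the product of the sizes of all loops
# nested inside loop i, i.e. how many iterations pass between changes of loop i.
def strides_for(loops)
  strides = []
  prod = 1
  loops.reverse_each do |values|
    strides.unshift(prod)
    prod *= values.size
  end
  strides
end

# Hypothetical helper: decompose a flat offset into one element per loop.
def tuple_at(loops, strides, offset)
  loops.zip(strides).map do |values, stride|
    index, offset = offset.divmod(stride)
    values[index]
  end
end

STRIDES = strides_for(LOOPS)              #=> [156, 6, 1]
TOTAL = LOOPS.map(&:size).reduce(:*)      #=> 156000

tuple_at(LOOPS, STRIDES, 157)             #=> [2, "a", "beta"]
```

The order produced this way matches the nested each loops in the question: the last loop varies fastest.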

So now you could write something like this:

save_interval = 10_000

tot_nbr_loops.times do |i|
  a, b, c = elements(loops, prod_loop_sizes, i)
  # perform calculations with a, b, c
  if i % save_interval == 0
    # save the value of i and the current state
    # delete the previously saved state
  end
end
end

One easy way to save most Ruby objects to a file (and retrieve them from it) is to use the method Marshal.dump (Marshal.load). (Note that the Marshal file format is not guaranteed to remain the same from one Ruby version to the next.)
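
As a rough sketch of the save and restore steps, something like the following could work. The helper names, the hash layout and the file name are all assumptions made for illustration. Instead of deleting the previous state, it writes to a temporary file and renames it over the old checkpoint, which on POSIX systems replaces the old file in one step.

```ruby
require "tmpdir"

# Hypothetical helper: write the checkpoint to a temporary file, then rename it
# over the old one, so a crash mid-write leaves the previous checkpoint intact.
def save_checkpoint(path, next_index, state)
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "wb") do |file|
    file.write(Marshal.dump({ next_index: next_index, state: state }))
    file.fsync # ask the OS to flush the data to disk before renaming
  end
  File.rename(tmp_path, path)
end

# Hypothetical helper: return the saved checkpoint, or a fresh one if none exists.
def load_checkpoint(path)
  return { next_index: 0, state: nil } unless File.exist?(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "job.checkpoint") # assumed file name
  checkpoint = load_checkpoint(path)
  results = checkpoint[:state] || []
  (checkpoint[:next_index]...25).each do |i|
    results << i * i # stand-in for the real calculation
    save_checkpoint(path, i + 1, results) if (i + 1) % 10 == 0
  end
end
```

On restart, the loop begins at the saved next_index, so at most one save interval of work is repeated. Only load checkpoint files you wrote yourself, since Marshal.load should not be used on untrusted data.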

Upvotes: 1

Kh Ammad

Reputation: 1085

Ruby provides the trap method for handling signals generated by the OS. When the system shuts down, it sends the SIGTERM signal to running processes, so all we need to do is handle that signal.

trap("TERM") do
  write_data_to_file
end

def write_data_to_file
  # code to save the data to a file
end

For reading from and writing to a file, there is a good answer here: https://stackoverflow.com/a/4310299/4136098
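
A slightly more defensive variation, as a sketch only: keep the handler tiny by having it set a flag, and let the main loop do the saving once the current iteration has finished. The method name run_until_term and its save: parameter are invented for this example. Note that SIGTERM is only delivered on an orderly shutdown; a power cut or SIGKILL cannot be trapped.

```ruby
# Hypothetical helper: runs `total` iterations, yielding each index to the block.
# If SIGTERM arrives, it finishes the current iteration, calls `save` with the
# number of completed iterations, and stops early.
def run_until_term(total, save:)
  stop_requested = false
  previous = trap("TERM") { stop_requested = true } # handler only sets a flag
  completed = 0
  total.times do |i|
    yield i # the long-running calculation goes here
    completed = i + 1
    if stop_requested
      save.call(completed)
      break
    end
  end
  completed
ensure
  trap("TERM", previous || "DEFAULT") # restore the earlier handler
end

saved_at = nil
done = run_until_term(5, save: ->(n) { saved_at = n }) { |i| i * i }
# With no signal delivered, all 5 iterations run and save is never called.
```

In a real program the save lambda would write the completed count (and any other state) to a file, and the next run would start from that count.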

Upvotes: 0
