Kevin Kredit
Kevin Kredit

Reputation: 43

Collapse array-building nested each loops into single reduce in Ruby

I'm trying to minimize procedural code in Ruby. I have the following function build_array. It procedurally builds up a array in a pair of nested each loops.

  def build_array(max)
    array = []
    (1..max).each do |size|
      (0..size).each do |n|
        array << [n, size - n]
      end
    end
    array 
  end

The function works correctly and produces output like this:

> build_array 2
 => [[0, 1], [1, 0], [0, 2], [1, 1], [2, 0]]
> build_array 3
 => [[0, 1], [1, 0], [0, 2], [1, 1], [2, 0], [0, 3], [1, 2], [2, 1], [3, 0]]

Can this function be further optimized with a reduce somehow? The building up of a new object based on the result of a series of operations seems to fit that pattern, but I can't figure it out. The order of the subarrays is not important.

Thanks!

Upvotes: 2

Views: 699

Answers (3)

J&#246;rg W Mittag
J&#246;rg W Mittag

Reputation: 369526

Can this function be further optimized with a reduce somehow?

I don't know about "optimized", but yes, you definitely can use reduce. And I didn't even read your code before I answered!

There is an interesting reason why I was able to give that answer without even looking at your question, and that is that reduce is a general iteration operator. Now, what do I mean by that?

If you think about what reduce does, on a deep level, here is how it works. A collection can either be empty or not empty. Those are the only two cases. And the reduce operator takes three arguments:

  • The collection being reduced,
  • What to do when the collection is not empty, and
  • What to do when the collection is empty.

Now, you might say "three arguments, I only see one", but that's because we are using Ruby:

  • The first argument is the hidden self argument, i.e. the collection that you are calling reduce on,
  • The second argument is the block, and
  • The third argument is the actual argument that Ruby's Enumerable#inject takes.

Now, typically we think about, and call the third argument the "accumulator". But in some other languages, it is also called the "zero", and there you can more easily see the connection to processing an empty collection.

But think about it: when you iterator over a collection, you generally have to deal with the fact that the collection is empty. And you have to somehow produce the next step in case the collection is not empty. There is really nothing else you need to do when iterating.

But those two things are exactly what is being covered by the last two arguments of the reduce operator!

So, it turns out that reduce can do everything that iteration can do. Really, everything. Everything that can be done with for / in or each in Ruby, foreach in C#, for / in, for / of, and forEach in JavaScript, the enhanced for loop in Java and C++, etc. can be done with reduce.

That means that every method that exists in Enumerable can be done with reduce.

Another way to think about it, is that a collection is a program written in a language that has only two commands: ELEMENT(x) and STOP. And reduce is an interpreter for that programming language that allows you to customize and provide your own implementations of ELEMENT(x) and STOP.

That is why the answer to the question "can this iteration thing be done with reduce" is always "Yes", regardless of what the "thing" that you're doing actually is.

So, back to your question, here is what a naive, blind, mechanical 1:1 translation of your code to reduce would look like:

def build_array(max)
  (1..max).reduce([]) do |acc, size|
    (0..size).reduce(acc) do |acc, n|
      acc << [n, size - n]
    end
  end
end

Whenever you have a code pattern of the form:

  • Create some object
  • Iterate over a collection and add to that object in each iteration
  • Return the object

That is exactly reduce: the object that you are creating is the accumulator and your loop body is the reducing operation. You can see that here: Comparison of before and after code.

Note that in general reduce is considered to be part of functional programming, and we are actually mutating the accumulator here. It would be more "pure" to instead return a new accumulator in each iteration:

def build_array(max)
  (1..max).reduce([]) do |acc, size|
    (0..size).reduce(acc) do |acc, n|
      acc + [[n, size - n]]
    end
  end
end

Alternatively, Ruby also has an "impure" version of reduce called each_with_object that is specifically designed for mutating the accumulator. In particular, reduce uses the return value of the block as the accumulator object for the next iteration, whereas each_with_object always passes the same object and just ignores the return value of the block:

def build_array(max)
  (1..max).each_with_object([]) do |size, array|
    (0..size).each_with_object(array) do |n, array|
      array << [n, size - n]
    end
  end
end

However, note that just because everything can be expressed as a reduce operation, not everything should be expressed as a reduce operation. There are several more specific operations, and if possible, those should be used.

In particular, in our case we are actually mostly transforming values, and that's what map is for. Or, flat_map if you want to process a nested collection into a non-nested one:

def build_array(max)
  (1..max).flat_map do |size|
    (0..size).map do |n|
      [n, size - n]
    end
  end
end

However, the most elegant solution in my opinion is the recursive one:

def build_array(max)
  return [] if max.zero?

  build_array(max.pred) + (0..max).map do |n|
    [n, max - n]
  end
end

However, note that this solution might be blow the stack for high values of max. Although that may or may not be a problem in your case because the result array also grows pretty large very quickly, so at the point where you run out of stack space, you pretty much also would run out of RAM, even with a non-recursive solution. On my laptop with YARV, I was able to determine that it breaks somewhere between 10000 and 20000, at which point the array uses well over 1GB.

The solution for this is to use a lazy infinite stream that generates each element one-by-one as it is consumed:

build_array = Enumerator.new do |y|
  (1..).each do |size|
    (0..size).each do |n|
      y << [n, size - n]
    end
  end
end

loop do
  print build_array.next
end
# [0, 1][1, 0][0, 2][1, 1][2, 0][0, 3][1, 2][2, 1][3, 0][0, 4][1, 3][2, 2]
# [3, 1][4, 0][0, 5][1, 4][2, 3][3, 2][4, 1][5, 0][0, 6][1, 5][2, 4][3, 3]
# [4, 2][5, 1][6, 0][0, 7][1, 6][2, 5][3, 4][4, 3][5, 2][6, 1][7, 0][0, 8]
# [1, 7][2, 6][3, 5][4, 4][5, 3][6, 2][7, 1][8, 0][0, 9][1, 8][2, 7][3, 6]
# [4, 5][5, 4][6, 3][7, 2][8, 1][9, 0][0, 10][1, 9][2, 8][3, 7][4, 6][5, 5]
# [6, 4][7, 3][8, 2][9, 1][10, 0][0, 11][1, 10][2, 9][3, 8][4, 7][5, 6][6, 5]
# [7, 4][8, 3][9, 2][10, 1][11, 0][0, 12][1, 11][2, 10][3, 9][4, 8][5, 7]
# [6, 6][7, 5][8, 4][9, 3][10, 2][11, 1][12, 0][0, 13][1, 12][2, 11]
# [3, 10][4, 9][5, 8][6, 7][7, 6][8, 5][9, 4][10, 3][11, 2][12, 1][13, 0]
# [0, 14][1, 13][2, 12][3, 11][4, 10][5, 9][6, 8][7, 7][8, 6][9, 5][10, 4]
# [11, 3][12, 2][13, 1][14, 0][0, 15][1, 14][2, 13][3, 12][4, 11][5, 10]
# [6, 9][7, 8][8, 7][9, 6][10, 5][11, 4][12, 3][13, 2][14, 1][15, 0]
# [0, 16][1, 15][2, 14][3, 13][4, 12][5, 11][6, 10][7, 9][8, 8][9, 7][10, 6]
# [11, 5][12, 4][13, 3][14, 2][15, 1][16, 0][0, 17][1, 16][2, 15][3, 14]
# [4, 13][5, 12][6, 11][7, 10][8, 9][9, 8][10, 7][11, 6][12, 5][13, 4][14, 3]
# [15, 2][16, 1][17, 0][0, 18][1, 17][2, 16][3, 15][4, 14][5, 13][6, 12]
# [7, 11][8, 10][9, 9][10, 8] …

You can let this run forever, it will never use any more memory. On my laptop, the Ruby process never grows beyond 9MB, where it was using multiple GB previously.

Upvotes: 3

Oleksandr Holubenko
Oleksandr Holubenko

Reputation: 4440

I think this question is mostly for Code Review

But let's try to help you: Here is example with Enumerable#inject and Array#new

def new_build_array(max)
  (1..max).inject([]) do |array, size|
    array += Array.new(size+1) { |e| [e, size-e] }
  end
end

> new_build_array 2
=> [[0, 1], [1, 0], [0, 2], [1, 1], [2, 0]]
> new_build_array 3
=> [[0, 1], [1, 0], [0, 2], [1, 1], [2, 0], [0, 3], [1, 2], [2, 1], [3, 0]]

Upvotes: 2

SteveTurczyn
SteveTurczyn

Reputation: 36860

You can implement this with map instead of each ... essentially the same code but you don't need to shovel it into an array as map returns an array for you.

def build_array(max)
  (1..max).map{|s|(0..s).map{|n|[n,s-n]}}.flatten(1)
end

reduce or the alias inject would require you to still insert into an array, so I can't see it would be better than your original method. I'd only use reduce if I wanted to return an object that consolidates the input array into a smaller collection.

Upvotes: 1

Related Questions