Roman

Reputation: 389

Enumerator behavior with generator block

I have a bit of a problem understanding how Ruby enumerators deal with generator blocks, in particular with generators of infinite sequences. I'm reading The Well-Grounded Rubyist and there's an example that goes like this:

a = [1, 2, 3, 4, 5]

e = Enumerator.new do |y|
  total = 0
  until a.empty?
    total += a.pop
    y << total
  end
end

e.take(2)
=> [5, 9]
a
=> [1, 2, 3]

What I expected was the following: it iterates over the whole enumerable from start to end and then returns the first two elements of the result, leaving the original array empty. But after giving it some thought, I realized that this wouldn't work for a generator producing an infinite sequence: the iteration would never end.

Now, I know that when created from an external enumerable, Enumerator uses Fibers to stop and resume execution each time the underlying Enumerable yields a value to the each block (or something like that; I only know the general idea). But how does Enumerator work when it is created from the constructor with a generator block?

I've tried digging into the underlying native code, but quickly got lost in the woods, since my C skills are subpar to say the least. From what I understood, internally each is called on the enumerator itself, with the take_i function provided as the block. I couldn't find any references to fibers, though, and wasn't able to dig much deeper.

Upvotes: 2

Views: 293

Answers (2)

Roman

Reputation: 389

Ok, I think I finally got it. What happens is the following:

Internally, e.take(2) on an enumerator is interpreted by Ruby as something like this:

result = []
e.each do |item|
  result << item
  break if result.size == 2
end
result

Returning to the way we initialized the enumerator: the whole block given to the constructor is the responsibility of an instance of the Enumerator::Generator class, which is created along with the enumerator itself. The y passed to this block is an instance of the Enumerator::Yielder class.
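To make the division of labor concrete, here is a minimal pure-Ruby sketch of the same idea. TinyEnumerator and TinyYielder are hypothetical names invented for this illustration; Ruby's real Enumerator::Generator and Enumerator::Yielder are implemented in C and do more than this, so treat it as a model, not the actual implementation.

```ruby
# Hypothetical stand-in for Enumerator::Yielder: it just forwards
# every value pushed with << to the block that was given to each.
class TinyYielder
  def initialize(&consumer)
    @consumer = consumer
  end

  def <<(value)
    @consumer.call(value)
    self # returning self allows chaining: y << 1 << 2
  end
end

# Hypothetical stand-in for Enumerator plus its generator: it stores
# the constructor block and runs it whenever each is called.
class TinyEnumerator
  include Enumerable

  def initialize(&generator)
    @generator = generator
  end

  def each(&block)
    @generator.call(TinyYielder.new(&block))
    self
  end
end

a = [1, 2, 3, 4, 5]

tiny = TinyEnumerator.new do |y|
  total = 0
  until a.empty?
    total += a.pop
    y << total
  end
end

taken = tiny.take(2) # Enumerable#take calls each and breaks after two items
```

With this sketch, taken is [5, 9] and a is left as [1, 2, 3], which matches the behavior of the real Enumerator in the question.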

When each is called on the enumerator, the generator code is executed first. When it reaches the y << total part, y passes this total value to the block given with the each call (it yields it, much like yield inside a method definition, but not quite). When that block finishes and returns, control goes back to the generator code, which performs another loop iteration and pushes a new value to the yielder, which yields it to the each block again, and so on. When the condition in the each block becomes true, everything stops, leaving part of the original array untouched. So there is no need for Fibers here, just some cool passing of control back and forth between two blocks of code.

It's worth noting, though, that the "each block" I was talking about is not an actual Ruby block: it's the C function take_i, treated as if it were a block passed to the each method.
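The back-and-forth described above can be observed with a small trace. This is only a sketch: the log array and its messages are additions for illustration, and the explicit each with break stands in for what take(2) does internally.

```ruby
log = []
a = [1, 2, 3, 4, 5]

e = Enumerator.new do |y|
  total = 0
  until a.empty?
    total += a.pop
    log << "generator yields #{total}"
    y << total # control jumps into the each block here
    log << "generator resumes after #{total}"
  end
end

result = []
e.each do |item|
  log << "each block got #{item}"
  result << item
  break if result.size == 2 # stops the generator mid-loop
end

puts log
# generator yields 5
# each block got 5
# generator resumes after 5
# generator yields 9
# each block got 9
```

There is no "generator resumes after 9" line: the break unwinds out of each, so the generator never continues and a still holds [1, 2, 3].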

You can read more about this, with diagrams and some extra information here: http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy

Upvotes: 1

Arthur

Reputation: 63

Enumerator is more like a lazy list calculator: it allows you to create theoretically infinite lists but only computes the elements you need, when you need them, running the loop in the constructor block just as many times as required to give you what you asked for. Thus your example doesn't empty a until you request 5 elements from e.

Ruby's documentation has an example of the Fibonacci sequence with an infinite loop: https://ruby-doc.org/core-2.2.0/Enumerator.html#method-c-new
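A sketch along the same lines as that documentation example (written from scratch here, starting at 0, with an iterations counter added purely to show how often the loop body runs):

```ruby
iterations = 0

fib = Enumerator.new do |y|
  current, following = 0, 1
  loop do # infinite loop; it only advances while someone consumes values
    iterations += 1
    y << current
    current, following = following, current + following
  end
end

first_ten = fib.take(10)
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

After take(10), iterations is exactly 10: the loop body started ten times and was cut off as soon as the tenth value was delivered.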

Upvotes: 0
