Sim
Sim

Reputation: 13528

Ruby Enumerator-based lazy flatten method

Michael Harrison has a great post on lazy enumerators in Ruby, providing an implementation of lazy_select and lazy_map. I am wondering whether the following implementation of lazy_flatten should have special processing for anything other than Enumerator and Enumerable types.

class Enumerator

  def lazy_flatten
    Enumerator.new do |yielder|
      self.each do |value|
        if value.kind_of? Enumerator
          value.lazy_flatten.each do |v|
            yielder.yield v
          end
        elsif value.kind_of? Enumerable
          value.flatten.each do |v|
            yielder.yield v
          end
        else
          yielder.yield value
        end
      end
    end
  end

end

Upvotes: 2

Views: 1445

Answers (2)

David Moles
David Moles

Reputation: 51123

Note that in Ruby 2.0+, you don't need to do this, you can just use Enumerable#lazy, which returns an Enumerator::Lazy.

For reasons that aren't clear to me, Lazy doesn't have flatten, but it has flat_map, so in principle you could just use flat_map with the identity function.

module Enumerable
  def lazy_flatten
    self.lazy.flat_map { |x| x }
  end
end

Lazy#flat_map mostly takes care of decomposing any decomposable elements, but not quite -- from the docs:

A value x returned by block is decomposed if either of the following conditions is true:

  1. x responds to both each and force, which means that x is a lazy enumerator.
  2. x is an array or responds to to_ary.

Note that to_ary is not a method on Enumerable, presumably to discourage implicit conversions from infinite sequences to arrays. This means, for instance, that if you try to lazy_flatten something that contains a Set or a Range with the above code, it (arguaby, see below) won't work:

a = [[1, 2, 3], Set[4, 5], 6, 7..8]
# => [[1, 2, 3], #<Set: {4, 5}>, 6, 7..8] 
f = a.lazy_flatten
# => #<Enumerator::Lazy: #<Enumerator::Lazy: [[1, 2, 3], #<Set: {4, 5}>, 6, 7..8]>:flat_map>  
f.to_a
# => [1, 2, 3, #<Set: {4, 5}>, 6, 7..8]

However, this is the same as the behavior of Array#flatten:

a.flatten
# => [1, 2, 3, #<Set: {4, 5}>, 6, 7..8]

(Although Array#flatten will not detect and decompose lazy enumerators, and Lazy#flat_map will.)

Whereas the OP's code or the code in Mladen Jablanović's answer will decompose the Set and the Range:

f = a.lazy_flatten # (M.J.'s code)
# => #<Enumerator: #<Enumerator::Generator:0x00007fd819c166c0>:each>
f.to_a
# => [1, 2, 3, 4, 5, 6, 7, 8]

That code, however, will also iterate infinitely if passed something that includes an infinite sequence:

a = [[1, 2, 3], Set[4, 5], 6, 7..8, 9..Float::INFINITY]
# => [[1, 2, 3], #<Set: {4, 5}>, 6, 7..8, 9..Infinity]
f = a.lazy_flatten # (M.J.'s code)
# => #<Enumerator: #<Enumerator::Generator:0x00007fd819a73d18>:each>
f.to_a
# => spins at 100% CPU for a while and eventually runs out of memory

If you consider that a feature, rather than a bug, one approach would be to modify the flat_map-based implementation to convert any enumerables it finds to lazy ones:

module Enumerable
  def lazy_flatten
    self.lazy.flat_map do |x|
      x.respond_to?(:lazy) ? x.lazy : x
    end
  end
end

This works even for nested lazy enumerables, as Lazy#lazy is smart enough to return itself.

Upvotes: 1

Mladen Jablanović
Mladen Jablanović

Reputation: 44080

  1. This doesn't seem lazy to me, as you are still performing old (non-lazy) flatten beneath.
  2. Enumerator is Enumerable, so I think you don't need to handle it separately.
  3. I would expect lazy_flatten to be method on Enumerable.

Here's how I would implement it:

module Enumerable
  def lazy_flatten
    Enumerator.new do |yielder|
      each do |element|
        if element.is_a? Enumerable
          element.lazy_flatten.each do |e|
            yielder.yield(e)
          end
        else
          yielder.yield(element)
        end
      end
    end
  end
end

Upvotes: 4

Related Questions