Reputation: 739

Why does Enumerator include Enumerable?

Dig this, here is a cool Enumerator (a lazily generated sequence) that starts at 1 and has no upper bound, because Float::INFINITY is positive infinity rather than the largest finite Float:

1.9.3-p327 :014 > e = (1..Float::INFINITY).each

Look at how we can grab the front of the sequence:

1.9.3-p327 :015 > e.first
 => 1 
1.9.3-p327 :016 > e.take(2)
 => [1, 2]

That's good stuff huh? I think so too. But then this:

1.9.3-p327 :017 > e.drop(2).first

Goes into lala land. And by that I mean it doesn't return in less than 5 seconds.

Oh here is a clue:

1.9.3-p327 :020 > p e.method(:drop)
#<Method: Enumerator(Enumerable)#drop>

It appears that the Enumerator (e) got its #drop method from the Enumerable (module) mixed in to the Enumerator (class). Now why in the world would Ruby go and mix Enumerable into Enumerator you ask? I do not know. But there it is, documented in both Enumerator in Ruby 1.9.3 and Enumerator in Ruby 2.0.

The problem as I see it is that some methods defined in Enumerable work, or kind of work, on Enumerator. Examples include #first and #take. At least one other, #drop, does not.

It seems to me that Enumerator including Enumerable is a bug. What do you think?

PS notice that Ruby 2.0 defines Enumerator::Lazy (subclass of Enumerator) which defines a bunch of the Enumerable methods as always lazy. Something smells fishy here. Why mix in the non-lazy and in some cases broken methods (into Enumerator) only to turn around and provide lazy alternatives in a subclass (of Enumerator)?
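For reference, here is a minimal sketch of the lazy alternative mentioned in the PS, assuming Ruby 2.0 or later (where Enumerable#lazy exists). The variable names are just for illustration:

```ruby
# Assumes Ruby >= 2.0, where Enumerable#lazy and Enumerator::Lazy are available.
e = (1..Float::INFINITY).each

# Enumerator::Lazy#drop returns another lazy enumerator instead of an Array,
# so nothing is evaluated until #first asks for elements.
lazy_rest  = e.lazy.drop(2)
third      = lazy_rest.first      # 3
next_three = lazy_rest.first(3)   # [3, 4, 5]
```

Unlike e.drop(2).first, this returns immediately, because only the elements actually requested are generated.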

Answers (2)

Jörg W Mittag

Reputation: 369594

That's a design choice that is common to many other collection frameworks as well.

Ruby's collection operations are not type-preserving. They always return an Array, regardless of what type of collection they were called on. That's also what, for example, .NET does, except there the type is always IEnumerable, which is both more useful (because more things can be represented as an IEnumerable than as an Array, e.g. infinite sequences) and at the same time less useful (because the interface of IEnumerable is much smaller than that of Array, so there are fewer operations you can do on it).

This allows Ruby's collection operations to be implemented once, without duplication.
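A quick sketch of what "not type-preserving" looks like in practice. The receivers below were picked only for illustration:

```ruby
require 'set'

# Three different receivers, one shared Enumerable#map implementation.
range_result = (1..3).map { |x| x * 2 }              # Range in, Array out
set_result   = Set.new([1, 2, 3]).map { |x| x * 2 }  # Set in, Array out
hash_result  = { a: 1 }.map { |k, v| [k, v * 2] }    # Hash in, Array of pairs out

classes = [range_result, set_result, hash_result].map(&:class)
# classes is [Array, Array, Array]
```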

It also means that it's very easy to integrate your own collections into Ruby's collection framework: just implement each, mix in Enumerable, and you are done. If a future version of Ruby adds a new collection method (e.g. flat_map in Ruby 1.9), you don't have to do anything; it just works with your collection, too.
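As a rough illustration, here is a made-up collection (Countdown is a hypothetical class invented for this example, not part of Ruby) that implements only each and gets the rest from Enumerable:

```ruby
# Hypothetical collection, invented for this example.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # The only method Enumerable needs: yield each element in turn.
  def each
    return enum_for(:each) unless block_given?
    @start.downto(1) { |n| yield n }
    self
  end
end

c = Countdown.new(3)
tens    = c.map { |n| n * 10 }       # [30, 20, 10]
odds    = c.select(&:odd?)           # [3, 1]
doubled = c.flat_map { |n| [n, n] }  # [3, 3, 2, 2, 1, 1]
```

Note that none of map, select, or flat_map is defined in Countdown; they all come from the mixin and all return Arrays.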

Another design choice would be to make all collection operations type-preserving. So, all collection operations return the type they were called on.

There are some languages which do this. It is, however, implemented by copy&pasting all collection methods into all collection classes, i.e. with massive code duplication.

This means that if you want to add your own collection to the collection framework, you have to implement every single method of the collection protocol. And if a future version of the language adds new methods, then you have to release a new version of your collection.

Scala 2.8's collection framework was the first time that someone figured out how to do type-preserving collection operations without code duplication. But that was long after Ruby's collection framework was designed. When Ruby's collection framework was designed, it was simply not yet known how to do type-preserving collection operations without code duplication, and the designers of Ruby opted against duplication.

Starting with Ruby 1.9, there is actually some duplication. Some Hash methods were duplicated to return Hashes instead of Arrays. And you already mentioned Ruby 2.0's Enumerator::Lazy, which duplicates many Enumerable methods to return Enumerator::Lazy.
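A small sketch of that duplication, assuming Ruby 1.9 or later. Hash#select is one of the methods Hash defines itself, while map here is the generic one that returns an Array:

```ruby
h = { a: 1, b: 2, c: 3 }

# Hash defines its own #select, so the result stays a Hash.
selected = h.select { |_key, value| value > 1 }

# #map is not overridden to preserve the type, so the result is an Array of pairs.
mapped = h.map { |key, value| [key, value] }

select_owner = h.method(:select).owner   # Hash, not Enumerable
```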

It would be possible to use the same tricks Scala uses in Ruby, but it would require a complete rework of the collection framework, which would make every existing collection implementation obsolete. Scala was able to do this because at the time there was hardly any user base.

Upvotes: 3

fmendez

Reputation: 7338

In response to the first part:

"Goes into lala land. And by that I mean it doesn't return in less than 5 seconds."

That behavior seems consistent with what those methods are supposed to do:

take(n) → array # Returns first n elements from enum.

That means you just need to iterate up to N to return it.

drop(n) → array # Drops first n elements from enum, and returns rest elements in an array.

That means that it needs the rest of the elements to be able to return them. And since your upper bound is Float::INFINITY, it never finishes collecting them.
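To make the difference visible without waiting forever, here is a sketch that uses a finite range and counts how many elements each call pulls from the source. The count_pulls helper is hypothetical, written only for this illustration:

```ruby
# Hypothetical helper: wraps a range in an Enumerator that counts how many
# elements are pulled, runs the given block against it, and returns
# [block_result, number_of_elements_pulled].
def count_pulls(range)
  pulled = 0
  enum = Enumerator.new do |yielder|
    range.each do |x|
      pulled += 1
      yielder << x
    end
  end
  result = yield enum
  [result, pulled]
end

taken,   take_pulls = count_pulls(1..1000) { |e| e.take(2) }
dropped, drop_pulls = count_pulls(1..1000) { |e| e.drop(2) }

# take stops iterating once it has its 2 elements;
# drop has to walk all 1000 elements to build its 998-element Array.
```

With an infinite source, the same walk in drop never ends, which is the "lala land" from the question.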

Source: Enumerable

Upvotes: 1
