user2012677

Reputation: 5735

Ruby: Understanding .to_enum better

I have been reading this:

https://docs.ruby-lang.org/en/2.4.0/Enumerator.html

I am trying to understand why someone would use .to_enum. How is the result different from a plain array? I see that :scan was passed into it, but what other arguments can it take?

Why not just use .scan in the case below? Any advice on how to understand .to_enum better?

"Hello, world!".scan(/\w+/)                     #=> ["Hello", "world"]
"Hello, world!".to_enum(:scan, /\w+/).to_a      #=> ["Hello", "world"]
"Hello, world!".to_enum(:scan).each(/\w+/).to_a #=> ["Hello", "world"]

Upvotes: 0

Views: 599

Answers (1)

tadman

Reputation: 211560

Arrays are, necessarily, constructs that live in memory. An array with a lot of entries takes up a lot of memory.

To put this in context, here's an example, finding all the "palindromic" numbers between 1 and 1,000,000:

# Create a large array of the numbers to search through
numbers = (1..1000000).to_a

# Filter to find palindromes
numbers.select do |i|
  is = i.to_s
  is == is.reverse
end

Even though there are only 1,998 such numbers, the entire array of a million entries needs to be created, then sifted through, then kept around until it is garbage collected.

An enumerator produces its values one at a time, so it never has to hold the whole sequence in memory; only the matches end up in the result. This is far more efficient:

# Uses an enumerator instead
numbers = (1..1000000).to_enum

# Filtering code looks identical, but behaves differently
numbers.select do |i|
  is = i.to_s
  is == is.reverse
end
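As a quick sanity check, here is a minimal sketch comparing the enumerator version with calling select directly on the range, which also works because Range is itself Enumerable. The palindrome? helper and the limit variable are names made up for this illustration.

```ruby
# Hypothetical helper, named for this illustration only.
def palindrome?(n)
  s = n.to_s
  s == s.reverse
end

limit = 1_000_000

# to_enum with no arguments wraps the receiver's #each method.
from_enum = (1..limit).to_enum.select { |i| palindrome?(i) }

# Range includes Enumerable, so select can also be called on it directly.
from_range = (1..limit).select { |i| palindrome?(i) }

puts from_enum.size           # 1998
puts from_enum == from_range  # true
puts from_enum.class          # Array
```

Either way, no million-element array is ever built. Only the matches are collected into the array that select returns.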

You can even take this a step further by making a custom Enumerator:

palindromes = Enumerator.new do |y|
  1.upto(1000000) do |i|
    is = i.to_s

    # Yield the number itself, and only when it reads the same both ways
    y << i if is == is.reverse
  end
end

This one doesn't need a separate filtering step: the enumerator only ever emits palindromic numbers.
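To show how such an enumerator might be consumed, here is a small sketch. It redefines the enumerator (yielding each palindrome itself) so that it runs on its own; the variable names first_twelve and under_100 are only illustrative.

```ruby
palindromes = Enumerator.new do |y|
  1.upto(1_000_000) do |i|
    s = i.to_s
    y << i if s == s.reverse
  end
end

# first(n) stops the block as soon as n values have been yielded,
# so the loop does not run all the way to 1,000,000 here.
first_twelve = palindromes.first(12)
p first_twelve  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 22, 33]

# take_while stops at the first value that fails the condition (101).
under_100 = palindromes.take_while { |n| n < 100 }
puts under_100.size  # 18

# External iteration: pull values one at a time with next.
puts palindromes.next  # 1
puts palindromes.next  # 2
palindromes.rewind     # start over from the beginning
```

Nothing is computed when the enumerator is created. The block only runs when something asks for values, and only for as long as values are requested.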

Enumerators can also be infinite in length, whereas arrays are necessarily finite. An infinite enumerator is useful when you want to filter and take the first N matching entries, as in this case:

# Endless range, new in Ruby 2.6. Don't call .to_a on this!
numbers = (1..).to_enum

numbers.lazy.select do |i|
  is = i.to_s
  is == is.reverse
end.take(1000).to_a

Using .lazy here means each entry is passed through select and then on to take one at a time, and evaluation stops as soon as take has collected its 1,000 entries. If you remove the lazy, Ruby will try to evaluate each stage to completion, which never happens on an infinite enumerator.
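One way to see the lazy behaviour is to count how many candidates actually get tested. This is a rough sketch, not a benchmark. The examined counter is an illustrative name, (1..Float::INFINITY) is used in place of the endless range so it also runs on Ruby versions before 2.6, and first(12) is used because it returns an array directly.

```ruby
examined = 0

first_twelve = (1..Float::INFINITY).lazy.select do |i|
  examined += 1  # count every candidate that is actually tested
  s = i.to_s
  s == s.reverse
end.first(12)

p first_twelve  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 22, 33]
puts examined   # 33
```

Only 33 numbers are examined, because the pipeline stops the moment the twelfth palindrome is found.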

Upvotes: 2
