eem

Reputation: 153

How is it that Ruby methods requiring blocks can just use Procs instead?

I am learning Ruby and was looking at the documentation of Array#map here, and it says the syntax is as follows:

map {|element| ... } → new_array
map → new_enumerator

But we are able to do arr.map(&:to_s) to stringify each element of the array. As far as I understand, &:to_s is just syntactic sugar for :to_s.to_proc so (as given here), it means that map accepts a Proc object as argument. But its method signature says otherwise.

I have a few questions regarding this.

  1. Can someone please explain this behavior and point to the relevant documentation for the same?

  2. What exactly does Proc mean in this context? Isn't to_s a method of the underlying class? What does it mean for me to pass a Proc object of to_s, which has no information about the underlying class on which it is going to be called?

Any help would be great!

Upvotes: 3

Views: 167

Answers (2)

engineersmnky

Reputation: 29588

While @user513951's answer is excellent and directly addresses your questions, I had originally added a few comments that I thought beneficial.

After some consideration I believe they add value to the post itself and as such I thought it best to memorialize them as an "Answer" of sorts.

Procs and lambdas

In Ruby, Proc objects come in 2 flavors: a Proc with "lambda-ness" and one without.

  • non-lambda Procs are formed via: Proc.new or proc { }
  • lambda Procs (Procs with "lambda-ness" turned on) are formed via: lambda { } or -> { }

As of Ruby 3.0, all core implementations of to_proc return a Proc with "lambda-ness" turned on. The only exception is Proc#to_proc, which returns the receiver itself, so a Proc without "lambda-ness" stays a non-lambda Proc.

Symbol#to_proc returns a lambda that is akin to

symbol = :to_s
to_s_to_proc = ->(obj,*rest) { obj.public_send(symbol,*rest) }

This proc has an arity of -2:

  • 1 required argument
  • n number of optional arguments to be passed to the method call
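The arity claim can be checked by asking the lambda itself. This is a minimal sketch that uses a hand-written stand-in shaped like the one above, not the real Symbol#to_proc implementation; the name to_s_like is made up for illustration.

```ruby
symbol = :to_s

# Hand-written stand-in for Symbol#to_proc (an approximation, name is hypothetical)
to_s_like = ->(obj, *rest) { obj.public_send(symbol, *rest) }

to_s_like.arity           #=> -2  (one required argument, then a splat)
to_s_like.lambda?         #=> true
to_s_like.parameters      #=> [[:req, :obj], [:rest, :rest]]
to_s_like.call(255, 16)   #=> "ff"  (same as 255.to_s(16))
```

A negative arity of -n - 1 means "n required arguments plus optional ones", which is why one required argument and a splat gives -2.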

In most cases Symbol#to_proc will be used as you have shown, e.g. map(&:to_s). In this case you don't notice that the lambda accepts other arguments, because each element of the collection is passed in turn as the required argument (obj in our example). However, we can also call it on its own, for instance:

to_s_to_proc[100,2] 
#=> "1100100"

This is the result of calling 100.to_s(2).

Other classes implement to_proc as well. In the core library this includes: Hash, Method, Proc, and Enumerator::Yielder. Each implementation has its own purpose, its own parameters, and its own arity.

Why does this matter? From the docs:

Procs are coming in two flavors: lambda and non-lambda (regular procs). Differences are:

In lambdas, return and break means exit from this lambda;

In non-lambda procs, return means exit from embracing method (and will throw LocalJumpError if invoked outside the method);

In non-lambda procs, break means exit from the method which the block given for. (and will throw LocalJumpError if invoked after the method returns);

In lambdas, arguments are treated in the same way as in methods: strict, with ArgumentError for mismatching argument number, and no additional argument processing;

Regular procs accept arguments more generously: missing arguments are filled with nil, single Array arguments are deconstructed if the proc has multiple arguments, and there is no error raised on extra arguments.
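The differences quoted above can be seen in a few lines. This is only a sketch; the method and variable names (return_from_lambda, return_from_proc, lenient, strict) are made up for the example.

```ruby
def return_from_lambda
  l = -> { return :from_lambda }
  l.call              # return only exits the lambda
  :after_lambda
end

def return_from_proc
  pr = proc { return :from_proc }
  pr.call             # return exits return_from_proc itself
  :after_proc         # never reached
end

return_from_lambda    #=> :after_lambda
return_from_proc      #=> :from_proc

lenient = proc   { |a, b| [a, b] }
strict  = lambda { |a, b| [a, b] }

lenient.call(1)         #=> [1, nil]   missing argument filled with nil
lenient.call([1, 2])    #=> [1, 2]     single Array argument is deconstructed
lenient.call(1, 2, 3)   #=> [1, 2]     extra argument silently dropped

# The lambda is strict about its argument count
strict_error =
  begin
    strict.call(1)
  rescue ArgumentError => e
    e.class
  end
strict_error            #=> ArgumentError
```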

How is this pertinent to the post?

Well, as we identified, to_proc will generally return a lambda Proc; blocks, however, are converted to non-lambda Procs.

def show_me(&block) = block 
show_me { } 
#=> #<Proc:0x00007f2a460afe70 (irb):2>
show_me(&:to_s) 
#=> #<Proc:0x00007f2a460a0538(&:to_s) (lambda)>

Due to the differences noted above this can cause inconsistencies in the usage of an explicit or implicit block vs using to_proc implementations. For instance:

def show_me(&block) = block.call(1,2) 

show_me {|a,b,c,d| [a,b,c,d] } 
#=> [1,2,nil,nil]
show_me { } 
#=> nil 
show_me(&{}) 
#=> wrong number of arguments (given 2, expected 1) (ArgumentError)

The error occurs because Hash#to_proc is essentially equal to ->(key) { hash[key] }. This lambda accepts only a single argument (the key whose value we want to retrieve), but block.call(1,2) called it with 2 arguments.

This also applies to implicit blocks e.g. def show_me_implicitly = yield 1,2
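As a small illustration, Hash#to_proc behaves well as long as the method yields exactly one value per call. The sketch below uses made-up data (ages) and a hand-written stand-in (lookup) shaped like the ->(key) { hash[key] } approximation above, so it shows the strictness without depending on Hash#to_proc internals.

```ruby
ages = { "ann" => 31, "bob" => 27 }   # hypothetical data

# map yields one element at a time, so the hash works fine as a block
looked_up = %w[ann bob].map(&ages)
looked_up   #=> [31, 27]

# Hand-written stand-in for Hash#to_proc (an approximation)
lookup = ->(key) { ages[key] }

too_many =
  begin
    lookup.call("ann", "bob")          # two arguments, lambda expects one
  rescue ArgumentError => e
    e.message
  end
too_many    #=> "wrong number of arguments (given 2, expected 1)"
```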

See the docs for Proc#lambda? for more information.

So while the sugar provided by &object_that_implements_to_proc is useful and generally considered idiomatic, there are some implications one should be aware of with regard to interchangeability with literal blocks.

Upvotes: 2

user513951

Reputation: 13715

It's not exactly accurate to say "&:to_s is just syntactic sugar for :to_s.to_proc", because that glosses over the rest of what the & character is doing.

You are likely aware that there is a "literal block" syntax in Ruby with two variations (docs):

foo { 1 + 1 }

foo do
  1 + 1
end

A Proc object is an object with the special property that it is allowed to be used in place of the literal block syntax.

So, for any method that accepts a literal block, you can instead pass it a Proc object—but only by using the & syntax (docs):

my_proc = Proc.new { 1 + 1 }

foo(&my_proc)

The & syntax means "use this Proc object in place of this method's block argument, instead of as a regular positional argument."

This is where the "sugar" comes in. If you use the & syntax to pass a non-Proc object, Ruby does you the favor of attempting to call to_proc on that object, to turn it into a Proc. You could do the same thing yourself, but you don't have to:

# Equivalent:
foo(&not_a_proc.to_proc)
foo(&not_a_proc)
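
To see that the & syntax really does call to_proc on whatever it is given, here is a minimal sketch with a made-up class. Doubler is hypothetical and exists only for this example; any object that responds to to_proc and returns a Proc would do.

```ruby
# Hypothetical class: not a Proc, but it knows how to become one
class Doubler
  def to_proc
    ->(n) { n * 2 }
  end
end

doubler = Doubler.new

# Equivalent: Ruby calls to_proc for us in the second form
explicit = [1, 2, 3].map(&doubler.to_proc)
implicit = [1, 2, 3].map(&doubler)

explicit   #=> [2, 4, 6]
implicit   #=> [2, 4, 6]
```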

In your example, the object you passed using the & syntax, :to_s, is a Symbol object. Because it is not a Proc object, Ruby calls to_proc on it. There is a method Symbol#to_proc (docs) that turns the symbol :to_s into a Proc that is approximately [1] equivalent to { |obj| obj.to_s }.

The end result:

arr = [1,2,3]
my_proc = Proc.new { |obj| obj.to_s }

# Equivalent:
arr.map { |obj| obj.to_s }
arr.map(&my_proc)
arr.map(&:to_s.to_proc)
arr.map(&:to_s)

[1] For some of the differences that make it only "approximately" equivalent, see the comments below and "What's the difference between a proc and a lambda in Ruby?"


As for

But its method signature says otherwise.

the documentation is a little fuzzy on how blocks and Procs are represented. There is this section of the Proc class documentation (docs) that addresses it indirectly:

Creation

There are several methods to create a Proc

  • Receiving a block of code into proc argument (note the &):

    def make_proc(&block)
      block
    end
    
    proc3 = make_proc {|x| x**2 }
    

So, if you were to implement the map method yourself, the argument signature could look like this:

def map(&block)

But as you can see, in order to show the needed block parameters, Ruby method documentation is instead written as you quoted it:

map {|element| ... } → new_array

and I don't think there's any place in the documentation that explains that exact relationship any better than what's already linked above.
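
To make the def map(&block) idea concrete, here is a rough sketch of a hand-rolled map. The name my_map and its array parameter are made up for illustration; this is not how Array#map is actually implemented, and it leaves out the no-block form that returns an enumerator.

```ruby
# Hypothetical re-implementation, for illustration only
def my_map(array, &block)
  result = []
  # block is an ordinary Proc object here, however the caller supplied it
  array.each { |element| result << block.call(element) }
  result
end

with_literal_block = my_map([1, 2, 3]) { |n| n.to_s }
with_symbol        = my_map([1, 2, 3], &:to_s)

with_literal_block   #=> ["1", "2", "3"]
with_symbol          #=> ["1", "2", "3"]
```

Inside the method there is no difference between the two calls: either way, block is a Proc that gets called once per element.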

Upvotes: 5
