Can someone please help me understand the following Ruby snippet?

Question

I recently ran into a permgen memory leak running Sinatra on JRuby in Tomcat. The problem had to do with the Tilt library that Sinatra uses to support various templating options. The old code (which is not included here) was generating the memory leak. The new code (below) does not, and in fact I see that permgen GC is now working.

Ruby is supposed to be self-describing, but I couldn't figure out this code by reading it. There are nested class evals. Why? Why is a method being defined and then unbound?

Why is code that compiles a bunch of templates and keeps them around for re-use so complicated looking?

Also: if there are any GitHub employees looking at this question, can you please add some functionality to GitHub that allows users to insert a question on a code snippet?

(This code was lifted from https://github.com/rtomayko/tilt/blob/master/lib/tilt.rb)

def compile_template_method(locals)  
  source, offset = precompiled(locals)  
  offset += 5  
  method_name = "__tilt_#{Thread.current.object_id.abs}"  
  Object.class_eval <<-RUBY, eval_file, line - offset  
    #{extract_magic_comment source}  
    TOPOBJECT.class_eval do  
      def #{method_name}(locals)    
        Thread.current[:tilt_vars] = [self, locals]  
        class << self  
          this, locals = Thread.current[:tilt_vars]  
          this.instance_eval do  
            #{source}  
          end  
        end  
      end  
    end  
  RUBY  
  unbind_compiled_method(method_name)  
end

Scott · Accepted Answer

There are nested class evals. Why?

Rather than being elegant self-describing code as you would reasonably expect, this method looks like it is from battle-scarred, fixed and patched production code (so perhaps we can forgive them a little).

So why two evals? The outer Object.class_eval evaluates a string, and that string must begin with the correct source encoding, which may have been declared as a "magic comment" in the template file. A magic comment only takes effect at the top of the code being parsed, so it has to come before the nested 'real' TOPOBJECT.class_eval that defines the template method.

Once the source encoding is set correctly, the real class_eval can run. Another way of saying that could be "This is source code that writes source code that writes source code"!

Presumably, this fixes compatibility issues that can arise in Ruby 1.9, where the template being compiled may use a character encoding (such as UTF-8) that differs from the encoding of the Tilt library source code itself (US-ASCII). Without the magic comment, the template's strings would be evaluated with the encoding already set by the host code that evaluates the template, and would come out wrong.
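As a rough sketch of the idea (not Tilt's actual implementation), the snippet below copies a template's magic comment to the first line of the string that gets eval'd. The helper names magic_comment_for and wrap_in_lambda are hypothetical and exist only for this illustration; the "template" is just a line of Ruby that reports its own source encoding.

```ruby
# Hypothetical helper: return the magic comment line of a piece of
# generated template source, or nil if it does not start with one.
def magic_comment_for(template_source)
  template_source[/\A[ \t]*#.*coding\s*[:=]\s*[\w.-]+.*$/]
end

# Hypothetical helper: wrap the template source in a lambda. Wrapping pushes
# the template's own magic comment off the first line, so a copy of it is
# placed at the very top of the string that will be eval'd.
def wrap_in_lambda(template_source)
  "#{magic_comment_for(template_source)}\nlambda do\n#{template_source}\nend"
end

# Stand-in for compiled template source that declares its own encoding.
template_source = "# encoding: US-ASCII\n__ENCODING__"

compiled = eval(wrap_in_lambda(template_source))
puts compiled.call.name # source encoding the template body was parsed with
```

The point of the wrapper is only that the comment must be the first thing the parser sees; where the wrapped code ends up (a lambda here, a method in the code from the question) is a separate choice.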

Why is a method being defined and then unbound?

To clarify: In Ruby, unbound is not the same as undefined.

Unbound methods exist as free method objects of type UnboundMethod. They are no longer associated with a particular object, so an unbound method has no receiver and cannot be called directly; it has to be bound to an object first, and the resulting bound method can then be called.
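A minimal sketch of that distinction, using a hypothetical Greeter class that is not part of Tilt: an UnboundMethod has no call method of its own and only becomes callable once it is bound to an object.

```ruby
# Hypothetical class used only to illustrate UnboundMethod.
class Greeter
  def initialize(name)
    @name = name
  end

  def greet
    "Hello #{@name}"
  end
end

# Module#instance_method returns the method without any receiver attached.
unbound = Greeter.instance_method(:greet)
puts unbound.class              # UnboundMethod
puts unbound.respond_to?(:call) # false: there is no receiver yet

# Binding attaches a receiver and yields an ordinary, callable Method.
bound = unbound.bind(Greeter.new("Scott"))
puts bound.call                 # Hello Scott
```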

In order to create an unbound method, the method first has to be defined on a class (here TOPOBJECT). This is why the compiled template method is defined on the top-level object and then quickly removed: it was only a temporary arrangement necessary to generate the unbound method.

This technique is used to make it possible to use compiled templates which are scoped against different instances of a given class, without changing the root object or the third-party developer's client class in any visible or permanent way.

By disassociating the compiled template method from a specific client code object, the compiled template method can be rebound later on to new instances of that object's class during future calls to templates that use objects of that type.

For example, given the following ERB template:

Hello <%= @name %>

... and the following calling code:

scott = Person.new
scott.name = "Scott"
output = template.render(scott)
=> "Hello Scott"

During this first render, the template is eval'd and compiled into an instance method on TOPOBJECT. The compiled template method will be named something like "__tilt_2151955260". This method is then unbound so that it can be used again against any instance of TOPOBJECT (which is simply Object or BasicObject depending upon the Ruby version), and therefore against any client object type.

The next time the template is rendered, the compiled template method is bound with the 'baq' instance of TOPOBJECT:

baq = Person.new
baq.name = "Baq"
output = template.render(baq)

Under the hood, when template.render(baq) is called, the unbound compiled template method is being bound against the 'baq' instance of Person:

__tilt_2151955260.bind(baq).call

Not having to call class_eval every time results in considerable performance gains.
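The whole cycle can be sketched roughly as follows. This is a simplified illustration under assumptions, not Tilt's code: TemplateSketch and the "__sketch_" method prefix are made-up names, the "template" is already plain Ruby source, and Object stands in for TOPOBJECT.

```ruby
# Hypothetical, simplified stand-in for a compiled template.
class TemplateSketch
  def initialize(ruby_source)
    @ruby_source = ruby_source
    @compiled = nil
  end

  # Compile on the first render only; afterwards just bind and call.
  def render(scope)
    @compiled ||= compile
    @compiled.bind(scope).call
  end

  private

  def compile
    method_name = "__sketch_#{object_id.abs}"
    # Temporarily define the method on Object so Ruby parses and compiles it.
    Object.class_eval("def #{method_name}\n#{@ruby_source}\nend")
    # Keep the method as a receiver-less UnboundMethod...
    unbound = Object.instance_method(method_name)
    # ...and remove it again so Object is left unchanged.
    Object.send(:remove_method, method_name)
    unbound
  end
end

class Person
  attr_accessor :name
end

template = TemplateSketch.new('"Hello #{@name}"')

scott = Person.new
scott.name = "Scott"
puts template.render(scott) # Hello Scott

baq = Person.new
baq.name = "Baq"
puts template.render(baq)   # Hello Baq
```

Because the method was defined on Object, the unbound method can be bound to an instance of any ordinary class, and the second render skips class_eval entirely.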

Why is code that compiles a bunch of templates and keeps them around for re-use so complicated looking?

My assessment is that although the code implementation does indeed look unnecessarily complex at first sight, these layers of indirection are often necessary in framework code that aims to make a public API incredibly simple and sweet to consume for many thousands of other developers, even if it is at the expense of the few developers that have to maintain it.

The code complexity (double eval nesting) has also increased as a result of real-world issues arising from an API that is consumed in many different locales and hence many encodings from around the world.

Footnote: The Template class referred to in the question has since been refactored into a separate file github.com/rtomayko/tilt/blob/master/lib/tilt/template.rb
