Sander_P

Reputation: 1835

How does this 'loop do' end?

I am rewriting a piece of Ruby code found on github.com in JavaScript. I have no trouble understanding the code, except for the part below. The question is: how does the 'loop do' end if there are no 'break's?

  def utf8_bytes(record_size, record_tag)
    Enumerator.new do |yielder|
      bits = compressed_bits record_size, record_tag
      loop do
        # Use the Huffman tree to decode the first character.
        node = tree_root
        while node < 0x100
          # p ['node', node]
          bit = bits.next
          # p ['bit', bit]
          node = (bit == 0) ? tree_left[node] : tree_right[node]
        end
        first_byte = node - 0x100
        # p ['utf8 start', first_byte]
        yielder << first_byte

        # The other characters are 10xxxxxx, where x'es are raw bits.
        2.upto utf8_char_bytes(first_byte) do
          byte = 0b10
          6.times do
            byte = (byte << 1) | bits.next
          end
          # p ['utf8 byte', byte]
          yielder << byte
        end
      end
    end
  end


Update

Thanks for all the answers, but unfortunately I still don't understand what is really happening. If I understand correctly, it is like a bucket. Every time you put something into it, it is processed. And 'loop do' runs as many times as there are bytes put into it.

The function is called only once, like so:

  text = utf8_bytes(record_size, record_tag).to_a.pack('C*')

But this is inside an Enumerator as well, so I guess the bytes drip from one bucket into the other.

In any case, I have translated the function into JavaScript. Maybe someone can tell me if this is correct? (Leaving aside that the JavaScript function returns an array, and that using arrays like this is probably not very efficient.)

    function utf8_bytes( record_size, record_tag ) {
        var yielder = [];
        // compressed_bits returns an array of 0's and 1's
        var bits = compressed_bits( record_size, record_tag );
        var v = 0;
        // Note: unlike bits.next in Ruby, bits[v++] gives undefined past the
        // end of the array instead of stopping the loop.
        while ( v < bits.length ) {
            // Use the Huffman tree to decode the first character.
            var node = tree_root;
            while ( node < 0x100 ) {
                // p ['node', node]
                var bit = bits[v++];
                // p ['bit', bit]
                node = (bit == 0) ? tree_left[node] : tree_right[node];
            }
            var first_byte = node - 0x100;
            // p ['utf8 start', first_byte]
            yielder.push( first_byte );

            // The other characters are 10xxxxxx, where x'es are raw bits.
            for ( var m = 2; m <= utf8_char_bytes(first_byte); m++ ) {
                var mbyte = 2; // binary 10
                for ( var n = 0; n < 6; n++ ) {
                    mbyte = (mbyte << 1) | bits[v++];
                }
                // p ['utf8 byte', mbyte]
                yielder.push( mbyte );
            }
        }
        return yielder;
    }

Upvotes: 1

Views: 131

Answers (3)

Marc-André Lafortune

Reputation: 79622

The loop never ends by itself, i.e. this method returns an infinite enumerator.

utf8_bytes(...).to_a # => never ends

These can be very useful, as the block you call them with can return before consuming the whole (infinite) enumerator:

def foo
  utf8_bytes(...).each do |byte|
    return byte if is_it_what_youre_looking_for?(byte)
  end
  # You'll never get here!
end

In a similar fashion, it is useful to get just a few values. For example:

utf8_bytes(...).first(100) # => array of length 100

To play around with a simpler "infinite" enumerator, you can use 0..Float::INFINITY instead of calling utf8_bytes.
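As a minimal sketch of that idea, the snippet below builds a small enumerator whose block never breaks and then consumes only part of it. The name naturals is a hypothetical stand-in for illustration and is not part of the original code.

```ruby
# Hypothetical stand-in for an enumerator whose block never breaks.
naturals = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 1
  end
end

# first(5) stops the enumeration as soon as five values have been produced.
first_five = naturals.first(5)

# find returns at the first match, so the rest is never generated.
first_big = naturals.find { |x| x * x > 50 }

p first_five  # prints [0, 1, 2, 3, 4]
p first_big   # prints 8
```

Calling naturals.to_a here would indeed never return, because nothing inside the block ever stops the loop.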

Upvotes: 0

PinnyM

Reputation: 35541

In Enumerator::Yielder, the yield method is aliased as <<. So calling:

yielder << some_byte

is the same as:

yielder.yield some_byte

Calling yield suspends the block at that point. Control returns to the block when next (or an equivalent C function) is called on the Enumerator object. If next is never called, the loop will not continue, and will remain in that state until the Enumerator falls out of scope and is garbage collected.
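A small experiment can make the suspension visible. The sketch below uses a hypothetical enumerator that records in an array (log, a name chosen only for this illustration) which parts of its block have run after each call to next.

```ruby
log = []

# Hypothetical enumerator that records when each part of its block runs.
e = Enumerator.new do |yielder|
  log << "before first"
  yielder << 1            # the block is suspended here until the next request
  log << "between"
  yielder << 2
  log << "after second"   # only reached if a third value is requested
end

a = e.next
log_after_first = log.dup

b = e.next
log_after_second = log.dup

p a, log_after_first    # prints 1, then ["before first"]
p b, log_after_second   # prints 2, then ["before first", "between"]
```

The block only advances as far as the consumer asks it to, which is why an endless loop inside the block does not by itself hang the program.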

You can read up on the Enumerator class for more info.

Upvotes: 1

phs

Reputation: 11061

It appears to be an enumerator (Note the Enumerator.new do |yielder|.) My guess is control flow returns every time the append operator (<<) is applied to yielder.
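There is also one way the loop can stop without a break, assuming compressed_bits returns a finite enumerator (the question does not show it, so this is an assumption). Enumerator#next raises StopIteration once the source is exhausted, and Kernel#loop rescues that exception and exits quietly. The names sample_bits and bit_pairs below are hypothetical stand-ins, not the original methods.

```ruby
# Hypothetical stand-in for compressed_bits: a finite enumerator of bits.
def sample_bits
  [1, 0, 1, 1, 0, 0].each
end

# Combines bits in pairs; note that there is no explicit break anywhere.
def bit_pairs
  Enumerator.new do |yielder|
    bits = sample_bits
    loop do
      hi = bits.next   # raises StopIteration once the bits run out
      lo = bits.next
      yielder << ((hi << 1) | lo)
    end
    # Kernel#loop rescued StopIteration, so the block simply ends here.
  end
end

pairs = bit_pairs.to_a
p pairs  # prints [2, 3, 0]
```

Under that assumption, to_a in the question's calling code would return once the inner enumerator has no more bits to give.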

Upvotes: 1
