SFEley

Reputation: 7786

Unpack signed little-endian in Ruby

So I'm working on some MongoDB protocol stuff. All integers are signed little-endian. Using Ruby's standard Array#pack method, I can convert from an integer to the binary string I want just fine:

positive_one = Array(1).pack('V')   #=> "\x01\x00\x00\x00"
negative_one = Array(-1).pack('V')  #=> "\xFF\xFF\xFF\xFF"

However, going the other way, the String#unpack method has the 'V' format documented as specifically returning unsigned integers:

positive_one.unpack('V').first #=> 1
negative_one.unpack('V').first #=> 4294967295

There's no formatter for signed little-endian byte order. I'm sure I could play games with bit-shifting, or write my own byte-mangling method that doesn't use array packing, but I'm wondering if anyone else has run into this and found a simple solution. Thanks very much.

Upvotes: 2

Views: 5918

Answers (4)

Ken Bloom

Reputation: 58810

After unpacking with "V", you can apply the following conversion:

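class Integer
  # Reinterpret an unsigned 32-bit value as two's-complement signed.
  def to_signed_32bit
    if self & 0x8000_0000 == 0x8000_0000   # sign bit (2**31) is set
      self - 0x1_0000_0000                 # subtract 2**32
    else
      self
    end
  end
end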

You'll need to change the magic constants 0x1_0000_0000 (which is 2**32) and 0x8000_0000 (2**31) if you're dealing with other sizes of integers.
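
As a rough sketch of that generalization, both constants can be derived from a bit width instead of being hard-coded. The `to_signed` name and its `bits` parameter are made up for illustration and are not part of the code above:

```ruby
# Hypothetical helper: reinterpret an unsigned integer of the given
# bit width as a two's-complement signed value.
def to_signed(value, bits = 32)
  sign_bit = 1 << (bits - 1)   # 0x8000_0000 when bits == 32
  modulus  = 1 << bits         # 0x1_0000_0000 when bits == 32
  (value & sign_bit).zero? ? value : value - modulus
end

to_signed("\xFF\xFF\xFF\xFF".unpack('V').first)   #=> -1
to_signed("\x01\x00\x00\x00".unpack('V').first)   #=> 1
to_signed("\xFE\xFF".unpack('v').first, 16)       #=> -2
```

The 16-bit case pairs the helper with the 'v' directive (unsigned 16-bit little-endian) the same way the 32-bit case pairs it with 'V'.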

Upvotes: 2

SFEley

Reputation: 7786

For the sake of posterity, here's the method I eventually came up with before spotting Paul Rubel's link to the "classical method". It's kludgy and based on string manipulation, so I'll probably scrap it, but it does work, so someone might find it interesting for some other reason someday:

# Returns an integer from the given little-endian binary string.
# @param [String] str
# @return [Fixnum]
def self.bson_to_int(str)
  bits = str.reverse.unpack('B*').first   # Get the 0s and 1s
  if bits[0] == '0'   # We're a positive number; life is easy
    bits.to_i(2)
  else                # Get the twos complement
    comp, flip = "", false
    bits.reverse.each_char do |bit|
      comp << (flip ? bit.tr('10','01') : bit)
      flip = true if !flip && bit == '1'
    end
    ("-" + comp.reverse).to_i(2)
  end
end

UPDATE: Here's the simpler refactoring, using a generalized arbitrary-length form of Ken Bloom's answer:

# Returns an integer from the given arbitrary length little-endian binary string.
# @param [String] str
# @return [Fixnum]
def self.bson_to_int(str)
  arr, bits, num = str.unpack('V*'), 0, 0
  arr.each do |int|
    num += int << bits
    bits += 32
  end
  num >= 2**(bits-1) ? num - 2**bits : num  # Convert from unsigned to signed
end
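
For anyone who wants to try the refactored version, here is a small usage sketch. The `BsonInt` module is a hypothetical wrapper added only so the method can be called standalone, and the input is assumed to be a multiple of 4 bytes long:

```ruby
# Hypothetical wrapper module; the body follows the refactored method above.
module BsonInt
  def self.bson_to_int(str)
    num, bits = 0, 0
    str.unpack('V*').each do |word|   # each word is an unsigned 32-bit chunk
      num += word << bits             # later words are more significant
      bits += 32
    end
    num >= 2**(bits - 1) ? num - 2**bits : num   # unsigned -> signed
  end
end

BsonInt.bson_to_int([1].pack('V'))                          #=> 1
BsonInt.bson_to_int([-1].pack('V'))                         #=> -1
BsonInt.bson_to_int("\xFE\xFF\xFF\xFF\xFF\xFF\xFF\xFF")     #=> -2 (64-bit)
BsonInt.bson_to_int("\x00\x00\x00\x00\x01\x00\x00\x00")     #=> 4294967296
```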

Upvotes: 1

Mark Wilkins

Reputation: 41252

Edit: I originally misunderstood the direction you were converting (see the comment). After thinking about it some more, I believe the solution is still the same. Here is the updated method. It does exactly the same thing, but the comments should explain the result:

def convertLEToNative( num )
    # Convert a given 4 byte integer from little-endian to the running
    # machine's native endianness.  The pack('V') operation takes the
    # given number and converts it to little-endian (which means that
    # if the machine is little endian, no conversion occurs).  On a
    # big-endian machine, the pack('V') will swap the bytes because
    # that's what it has to do to convert from big to little endian.
    # Since the number is already little endian, the swap has the
    # opposite effect (converting from little-endian to big-endian),
    # which is what we want. In both cases, the unpack('l') just
    # produces a signed integer from those bytes, in the machine's
    # native endianness.
    # Note: unpack returns a one-element array, not a bare integer.
    Array(num).pack('V').unpack('l')
end

Probably not the cleanest, but this will convert the byte array.

def convertLEBytesToNative( bytes )
    if ( [1].pack('V').unpack('l').first == 1 )
        # machine is already little endian
        bytes.unpack('l')
    else
        # machine is big endian
        convertLEToNative( Array(bytes.unpack('l')))
    end
end
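
As far as I know, Ruby 1.9.3 and later accept `<` and `>` endianness modifiers on the integer directives, which would make the machine check unnecessary. A minimal sketch, assuming such a Ruby version; the method name is hypothetical:

```ruby
# Hypothetical helper; assumes Ruby 1.9.3+, where 'l<' reads a 32-bit
# signed integer as little-endian regardless of the host's byte order.
def le_bytes_to_int32(bytes)
  bytes.unpack('l<').first
end

le_bytes_to_int32("\xFF\xFF\xFF\xFF")   #=> -1
le_bytes_to_int32("\x01\x00\x00\x00")   #=> 1
```

On older Rubies without these modifiers, the approach above is still needed.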

Upvotes: 2

Paul Rubel

Reputation: 27252

This question has a method for converting signed to unsigned that might be helpful. It also has a pointer to the bindata gem which looks like it will do what you want.

require 'bindata'   # third-party gem

BinData::Int16le.read("\000\f") #=> 3072

[edited to remove the not-quite-right s unpack directive]

Upvotes: 2
