Reputation: 11834
I have a hex string of unknown (variable) length, and I want to be able to pack it into bytes and unpack it back to hex.
["a"].pack("H*")
# => "\xA0"
I'm getting \xA0. Is that because it is little-endian? I was expecting \xA or \x0A.
Similarly, I get the hex string a0 when unpacking again:
["a"].pack("H*").unpack("H*").first
=> "a0"
Again, I was expecting a or 0a, so I'm a bit confused. Is this all the same?
I would prefer big-endian for hex strings, but it appears that .pack does not accept an endianness modifier for H:
["a"].pack("H>*").unpack("H>*")
ArgumentError: '>' allowed only after types sSiIlLqQjJ (ArgumentError)
from <internal:pack>:8:in `pack'
How can I get big-endian hex values from unpack?
Upvotes: 3
Views: 129
Reputation: 13715
Let's collect some facts.
First, from "How to identify the endianness of given hex string?":
"Bytes don't have endianness." – @MichaelPetch
– @VC.One
You only get endianness once you start stringing bytes together. So that's not at issue here.
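To see where byte order does come into play, here is a minimal sketch using the 16-bit integer directive S, which (unlike H) accepts the > and < modifiers. The value 0x0102 is only an illustrative choice.

```ruby
# Endianness only matters once a value spans more than one byte.
value = 0x0102                      # arbitrary two-byte example value

big    = [value].pack("S>").bytes   # most significant byte first
little = [value].pack("S<").bytes   # least significant byte first

p(big)     # => [1, 2]
p(little)  # => [2, 1]

# A single byte has only one possible layout, so there is nothing to reorder.
single = [0x0A].pack("C").bytes
p(single)  # => [10]
```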
Next, from "What does ["string"].pack('H*') mean?":
[Array.pack] interprets the string as hex numbers, two characters per byte, and converts it to a string with the characters with the corresponding ASCII code.
So your string "a", being one character, doesn't describe even one full byte.
Finally, from the packed_data docs:
['fff'].pack('h3') # => "\xFF\x0F"
['fff'].pack('h4') # => "\xFF\x0F"
['fff'].pack('h5') # => "\xFF\x0F\x00"
['fff'].pack('h6') # => "\xFF\x0F\x00"
This shows that input strings that are too short are treated as though they were right-padded with 0.
Putting all this together, it becomes clear that Array.pack is, in effect, padding your too-short input string "a" with a 0 on the right so that it can work with it at all, and everything treats the input as the string "a0" from there.
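One way to check this is to look at the packed byte as bits. The sketch below assumes Ruby 2.4 or later (for unpack1); the variable names are only illustrative.

```ruby
# View the packed byte as bits to see which nibble the lone "a" lands in.
short   = ["a"].pack("H*")
padded  = ["a0"].pack("H*")
leading = ["0a"].pack("H*")

puts(short.unpack1("B*"))    # => 10100000  ("a" fills the high nibble)
puts(short == padded)        # => true      ("a" behaves exactly like "a0")
puts(leading.unpack1("B*"))  # => 00001010  (what the question expected)
```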
If you're not satisfied with that behavior, the one lever you can pull is to swap H* for h*, which according to the docs trades "high nibble first" for "low nibble first."
Here's an illustration of the effects of that change. (I'll use f instead of a, because \x0A gets rewritten as \n, making the effect harder to see.)
# Determines how order of nibbles ("half-bytes") is interpreted
["f0"].pack("H*") # => "\xF0"
["f0"].pack("h*") # => "\x0F"
["0f"].pack("H*") # => "\x0F"
["0f"].pack("h*") # => "\xF0"
# Always right-pads input ("f" matches behavior of "f0…", never "…0f")
["f"].pack("H*") # => "\xF0"
["f"].pack("h*") # => "\x0F"
["f"].pack("H4") # => "\xF0\x00"
["f"].pack("h4") # => "\x0F\x00"
# Changes nothing in the round-trip conversion
["f0"].pack("H*").unpack("H*") # => ["f0"]
["f0"].pack("h*").unpack("h*") # => ["f0"]
["0f"].pack("H*").unpack("H*") # => ["0f"]
["0f"].pack("h*").unpack("h*") # => ["0f"]
It seems like this nibble-ordering is what you had in mind when you asked about endianness, so I hope this helps. However, note that whichever nibble order you choose, a 1-character input string will always be right-padded with a zero, never left-padded.
Upvotes: 4
Reputation: 34328
Endianness is not a concept that applies to a single byte, but when converting hex to bytes the Ruby pack method talks about nibbles instead. A nibble is one 4-bit half of an 8-bit byte, so each byte contains two nibbles.
You have H, which packs high nibble first (the normal way you'd expect hex to be packed into bytes), and you have h, which packs low nibble first.
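As a small sketch of what the two nibbles are and how H and h order them, take the arbitrary example byte 0xA5 (unpack1 assumes Ruby 2.4 or later):

```ruby
byte = 0xA5          # arbitrary example value
high = byte >> 4     # upper four bits: 0xA
low  = byte & 0x0F   # lower four bits: 0x5

packed = [byte].pack("C")

p(packed.unpack1("H*"))            # => "a5" (high nibble first)
p(packed.unpack1("h*"))            # => "5a" (low nibble first)
p(high.to_s(16) + low.to_s(16))    # => "a5"
```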
What you're looking for is H, high nibble first (what you refer to as big-endian). The only thing you're missing is that pack fills bytes from left to right, so in an odd-length hex string the final, unpaired digit becomes the high nibble of the last byte and the low nibble is filled with zero.
Therefore "a" will always be interpreted as 0xA0 by pack, when what you want is 0x0A.
To fix this, pad odd-length hex strings with a 0 at the beginning, and you should get the results you want.
def hexpack(str)
  # Left-pad odd-length input so the first digit becomes a low nibble
  [(str.length.odd? ? "0" + str : str)].pack("H*")
end

hexpack("a").unpack("H*")    # => ["0a"]
hexpack("ab").unpack("H*")   # => ["ab"]
hexpack("abc").unpack("H*")  # => ["0abc"]
hexpack("10203").bytes       # => [1, 2, 3]
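If you also want the unpacked string without the padding zero, one possible counterpart is sketched below. The name hexunpack is hypothetical, and it assumes you always want the shortest hex form. That makes it lossy: an input that really began with "0", such as "0a", also comes back as "a".

```ruby
# Hypothetical helpers; neither name is part of Ruby itself.
def hexpack(str)
  [(str.length.odd? ? "0" + str : str)].pack("H*")
end

# Assumption: the caller wants the shortest hex form, so a single leading
# "0" digit is dropped (a lone "0" is kept as-is).
def hexunpack(bytes)
  bytes.unpack1("H*").sub(/\A0(?=.)/, "")
end

p(hexunpack(hexpack("a")))    # => "a"
p(hexunpack(hexpack("ab")))   # => "ab"
p(hexunpack(hexpack("abc")))  # => "abc"
```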
Upvotes: 3