ivan

Reputation: 6322

Converting nginx uuid from hex to Base64: how is byte-order involved?

Nginx can be configured to generate a uuid suitable for client identification. Upon receiving a request from a new client, it appends a uuid in two forms before forwarding the request upstream to the origin server(s):

I want to convert a hexadecimal representation to the Base64 equivalent. I have a working solution in Ruby, but I don't fully grasp the underlying mechanics, especially the switching of byte-orders:

hex_str = "4706020A47525F56980D5D8402190303"

Treating hex_str as a string of hex digits, high nibble first (the first digit of each pair supplies the most significant 4 bits of the byte), produce the corresponding binary string:

binary_seq = [hex_str].pack("H*")

# 47 (71 decimal) -> "G"
# 06  (6 decimal) -> "\x06" (non-printable)
# 02  (2 decimal) -> "\x02" (non-printable)
# 0A (10 decimal) -> "\n"
# ...

#=> "G\x06\x02\nGR_V\x98\r]\x84\x02\x19\x03\x03"
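To check my reading of "H*", here is a minimal sketch that decodes the hex string two digits at a time and compares the result with pack. The helper name hex_to_bytes is made up for illustration; it is not part of Ruby or nginx.

```ruby
# Hypothetical helper: decode a hex string two digits at a time.
# The first digit of each pair is the high nibble, which is my
# understanding of what the "H*" directive does.
def hex_to_bytes(hex)
  hex.scan(/../).map { |pair| pair.to_i(16) }
end

hex_str = "4706020A47525F56980D5D8402190303"
bytes = hex_to_bytes(hex_str)

p bytes.first(4)                            # [71, 6, 2, 10]
p bytes.pack("C*") == [hex_str].pack("H*")  # true
```

So at this stage no byte order is involved yet: each pair of hex digits simply becomes one byte, in the order written.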

Unpack binary_seq into an array of 32-bit unsigned integers, reading each group of 4 bytes (4 bytes = 32 bits) as little-endian:

data = binary_seq.unpack("VVVV")

# "G\x06\x02\n"      ->  167904839 (?)
# "GR_V"             -> 1449087559 (?)
# "\x98\r]\x84"      -> 2220690840 (?)
# "\x02\x19\x03\x03" ->   50534658 (?)

#=> [167904839, 1449087559, 2220690840, 50534658]
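As far as I can tell, "little-endian" here means the first byte of each group is the least significant one. A sketch of that arithmetic, assuming my reading is right (le_uint32 is a hypothetical helper, not a Ruby built-in):

```ruby
# Hypothetical helper: read a 4-byte string as a little-endian uint32,
# i.e. byte 0 is weighted by 2**0, byte 1 by 2**8, and so on.
def le_uint32(four_bytes)
  b0, b1, b2, b3 = four_bytes.bytes
  b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)
end

chunk = "G\x06\x02\n" # bytes 0x47 0x06 0x02 0x0A

p le_uint32(chunk)                             # 167904839
p le_uint32(chunk).to_s(16)                    # "a020647"
p le_uint32(chunk) == chunk.unpack("V").first  # true
```

Written in hex, the result is 0x0A020647, which is the original four bytes in reverse order.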

Pack each integer in data back into 4 bytes, this time as a 32-bit big-endian (network byte order) unsigned integer, to produce a new binary string:

network_seq = data.pack("NNNN")

#  167904839 -> "\n\x02\x06G"      (?)
# 1449087559 -> "V_RG"             (?)
# 2220690840 -> "\x84]\r\x98"      (?)
#   50534658 -> "\x03\x03\x19\x02" (?)

#=> "\n\x02\x06GV_RG\x84]\r\x98\x03\x03\x19\x02"
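If I follow this correctly, the net effect of unpack("VVVV") followed by pack("NNNN") is just reversing the bytes inside each 4-byte group. A sketch to test that assumption (swap_words is a made-up name):

```ruby
# Hypothetical helper: reverse the bytes within every 4-byte group,
# without going through integers at all.
def swap_words(binary)
  binary.bytes.each_slice(4).flat_map(&:reverse).pack("C*")
end

binary_seq = ["4706020A47525F56980D5D8402190303"].pack("H*")

p swap_words(binary_seq) == binary_seq.unpack("V*").pack("N*")  # true
p swap_words(binary_seq).unpack("H*").first
# "0a020647565f5247845d0d9803031902"
```

The group order stays the same; only the byte order within each group changes.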

Encode network_seq as a Base64 string:

require "base64"

Base64.encode64(network_seq).strip

#=> "CgIGR1ZfUkeEXQ2YAwMZAg=="
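Putting the steps together, the whole conversion can be sketched as one method. The name uid_hex_to_base64 is hypothetical, and I am using pack("m0") for the Base64 step, which produces the same output as Base64.strict_encode64 (no trailing newline, so no strip is needed):

```ruby
# Hypothetical helper: hex form in, Base64 form out.
def uid_hex_to_base64(hex)
  # hex digits -> raw bytes -> four little-endian uint32 values
  words = [hex].pack("H*").unpack("V*")
  # uint32 values -> big-endian bytes -> Base64 without line breaks
  [words.pack("N*")].pack("m0")
end

p uid_hex_to_base64("4706020A47525F56980D5D8402190303")
# "CgIGR1ZfUkeEXQ2YAwMZAg=="
```

This only covers the hex-to-Base64 direction for a 32-digit input like the one above; I have not checked it against other inputs.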

My rough understanding is that big-endian is the standard byte order for network communication, while little-endian is more common on host machines. I'm not sure why nginx provides two forms that require a byte-order switch to convert between them.

I also don't understand how the .unpack("VVVV") and .pack("NNNN") steps work. I can see that "G\x06\x02\n" becomes "\n\x02\x06G", but I don't understand the intermediate steps. For example, focusing on the first 8 digits of hex_str, why do .pack("H*") and .unpack("VVVV") produce:

"4706020A" -> "G\x06\x02\n" -> 167904839

whereas converting directly to base-10 produces:

"4706020A".to_i(16) -> 1191576074

? The fact that I'm asking this shows I need clarification on what exactly is going on in all these conversions :)
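A small experiment that narrows the question down, assuming I am reading the "N" and "V" directives correctly: the same four bytes give either number depending on which end is treated as most significant (read_both_orders is a made-up helper).

```ruby
# Hypothetical helper: interpret the same 4 bytes in both byte orders.
def read_both_orders(hex8)
  bytes = [hex8].pack("H*")
  { big: bytes.unpack("N").first, little: bytes.unpack("V").first }
end

orders = read_both_orders("4706020A")

p orders[:big]                               # 1191576074
p orders[:big] == "4706020A".to_i(16)        # true
p orders[:little]                            # 167904839
p orders[:little] == "0A020647".to_i(16)     # true
```

So to_i(16) reads the digits the way "N" does (first byte most significant), while "V" weights the same bytes in the opposite order.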

Upvotes: 3

Views: 1350

Answers (0)
