Nginx can be configured to generate a uuid suitable for client identification. Upon receiving a request from a new client, it appends a uuid in two forms before forwarding the request upstream to the origin server(s):
CgIGR1ZfUkeEXQ2YAwMZAg==
4706020A47525F56980D5D8402190303
I want to convert the hexadecimal representation to its Base64 equivalent. I have a working solution in Ruby, but I don't fully grasp the underlying mechanics, especially the byte-order switching:
hex_str = "4706020A47525F56980D5D8402190303"
Treating hex_str as a sequence of hex digits (high nibble first), pack it into the corresponding binary string, one byte per pair of digits:
binary_seq = [hex_str].pack("H*")
# 47 (71 decimal) -> "G"
# 06 (6 decimal) -> "\x06" (non-printable)
# 02 (2 decimal) -> "\x02" (non-printable)
# 0A (10 decimal) -> "\n"
# ...
#=> "G\x06\x02\nGR_V\x98\r]\x84\x02\x19\x03\x03"
Map binary_seq to an array of 32-bit little-endian unsigned integers. Each group of 4 bytes (32 bits) maps to one integer:
data = binary_seq.unpack("VVVV")
# "G\x06\x02\n" -> 167904839 (?)
# "GR_V" -> 1449087559 (?)
# "\x98\r]\x84" -> 2220690840 (?)
# "\x02\x19\x03\x03" -> 50534658 (?)
#=> [167904839, 1449087559, 2220690840, 50534658]
Treating data as an array of 32-bit unsigned integers, pack each one in big-endian (network) byte order to produce a binary string:
network_seq = data.pack("NNNN")
# 167904839 -> "\n\x02\x06G" (?)
# 1449087559 -> "V_RG" (?)
# 2220690840 -> "\x84]\r\x98" (?)
# 50534658 -> "\x03\x03\x19\x02" (?)
#=> "\n\x02\x06GV_RG\x84]\r\x98\x03\x03\x19\x02"
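As far as I can tell, the net effect of unpack("VVVV") followed by pack("NNNN") is simply to reverse the bytes within each 4-byte group while leaving the order of the groups alone. A minimal sketch that checks this without going through integers (the names swapped and via_integers are only illustrative):

```ruby
hex_str = "4706020A47525F56980D5D8402190303"
binary_seq = [hex_str].pack("H*")

# Reverse the bytes inside each 4-byte word; the word order stays the same.
swapped = binary_seq.bytes.each_slice(4).flat_map(&:reverse).pack("C*")

# The same bytes, produced by the unpack/pack round trip.
via_integers = binary_seq.unpack("V4").pack("N4")

puts swapped == via_integers
#=> true
```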
Encode network_seq as a Base64 string:
require "base64"
Base64.encode64(network_seq).strip
#=> "CgIGR1ZfUkeEXQ2YAwMZAg=="
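Putting the steps together, the whole conversion might be wrapped up roughly like this. This is only a sketch: the method names uid_hex_to_base64 and uid_base64_to_hex are hypothetical, and it assumes the input is always exactly 32 hex digits (16 bytes). It uses the "m0" pack directive (Base64 without line breaks) in place of Base64.encode64.

```ruby
# Hypothetical helpers; not part of nginx or Ruby's standard library.

# Hex form -> Base64 form: swap the bytes in each 32-bit word, then encode.
def uid_hex_to_base64(hex)
  swapped = [hex].pack("H*").unpack("V4").pack("N4")
  [swapped].pack("m0") # "m0" = Base64 with no trailing newline
end

# Base64 form -> hex form: decode, swap the bytes back, then hex-encode.
def uid_base64_to_hex(b64)
  b64.unpack1("m0").unpack("N4").pack("V4").unpack1("H*").upcase
end

puts uid_hex_to_base64("4706020A47525F56980D5D8402190303")
#=> CgIGR1ZfUkeEXQ2YAwMZAg==
puts uid_base64_to_hex("CgIGR1ZfUkeEXQ2YAwMZAg==")
#=> 4706020A47525F56980D5D8402190303
```

The reverse direction uses the same two directives with their roles exchanged, since swapping the bytes of each word twice restores the original order.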
My rough understanding is that big-endian is the standard byte order for network communications, while little-endian is more common on host machines. I'm not sure why nginx provides two forms that require a byte-order swap to convert between them.
I also don't understand how the .unpack("VVVV") and .pack("NNNN") steps work. I can see that G\x06\x02\n becomes \n\x02\x06G, but I don't understand the steps that get there. For example, focusing on the first 8 digits of hex_str, why do .pack("H*") and .unpack("VVVV") produce:
"4706020A" -> "G\x06\x02\n" -> 167904839
whereas converting directly to base-10 produces:
"4706020A".to_i(16) -> 1191576074
? The fact that I'm asking this shows I need clarification on what exactly is going on in all these conversions :)
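To make the discrepancy concrete, here is a small sketch that reads the same 4 bytes both ways. The variable names are only illustrative, and the hand-written sum is my assumption of how the little-endian value is built up (the first byte being the least significant):

```ruby
bytes = ["4706020A"].pack("H*") # 4 bytes: 0x47 0x06 0x02 0x0A

big    = bytes.unpack1("N") # big-endian: first byte is most significant
little = bytes.unpack1("V") # little-endian: first byte is least significant

puts big    #=> 1191576074
puts little #=> 167904839

# The little-endian value written out by hand: byte i is weighted by 256**i.
manual = 0x47 + 0x06 * 256 + 0x02 * 256**2 + 0x0A * 256**3
puts manual == little             #=> true
puts big == "4706020A".to_i(16)   #=> true
puts little == "0A020647".to_i(16) #=> true
```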