Eugene

Reputation: 4879

Convert UUID to bytes

I'm trying to take a uuid and convert it back into the bytes it was generated from.

I've been studying the SecureRandom source to see if I can reverse engineer the UUID back into bytes, but I'm having a hard time with it.

What I need to do is basically the inverse of this:

def self.uuid
  ary = self.random_bytes(16).unpack("NnnnnN")
  ary[2] = (ary[2] & 0x0fff) | 0x4000
  ary[3] = (ary[3] & 0x3fff) | 0x8000
  "%08x-%04x-%04x-%04x-%04x%08x" % ary
end

So I have this uuid:

"4b6d2066-78ac-49db-b8c4-9f58d8e8842f"

Which started out as this string of bytes:

"Km fx\xAC\xC9\xDB\xF8\xC4\x9FX\xD8\xE8\x84/"

Which was this unpacked:

[
    [0] 1265442918,
    [1] 30892,
    [2] 51675,
    [3] 63684,
    [4] 40792,
    [5] 3639116847
]

And this after ary[2] and ary[3] were modified:

[
    [0] 1265442918,
    [1] 30892,
    [2] 18907,
    [3] 47300,
    [4] 40792,
    [5] 3639116847
]

So I got the original uuid split back out into its unjoined parts:

[
    [0] "4b6d2066",
    [1] "78ac",
    [2] "49db",
    [3] "b8c4",
    [4] "9f58",
    [5] "d8e8842f"
]

The problem I'm having is I'm not sure what the inverse of these 2 lines is:

ary[2] = (ary[2] & 0x0fff) | 0x4000
ary[3] = (ary[3] & 0x3fff) | 0x8000

And I'm also not sure how to turn the remaining elements back into their integer values. I'm sure it is some form of pack or unpack, but I'm not sure what it needs.

Upvotes: 1

Views: 2013

Answers (1)

Denis de Bernardy

Reputation: 78561

The problem I'm having is I'm not sure what the inverse of these 2 lines is:

ary[2] = (ary[2] & 0x0fff) | 0x4000
ary[3] = (ary[3] & 0x3fff) | 0x8000

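Those two lines aren't reversible. They use bitwise operations to set the version and variant bits of a v4 (aka random) UUID, for RFC 4122 compliance. In doing so they overwrite 6 of the original random bits (4 in ary[2], 2 in ary[3]), and those bits cannot be recovered from the UUID.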

http://www.ietf.org/rfc/rfc4122.txt

Applied to the high byte of each of those two fields, the numbers should correspond to:

byte & UUID_CLEAR_VER | UUID_VERSION_4 = byte & b'00001111' | b'01000000'
byte & UUID_CLEAR_VAR | UUID_VAR_RFC   = byte & b'00111111' | b'10000000'
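To make the information loss concrete, here is a minimal Ruby sketch using the values from the question. The helper names stamp_version and stamp_variant are made up for illustration; only the masks come from the SecureRandom code quoted above.

```ruby
# Hypothetical helpers wrapping the two masking lines from SecureRandom.uuid.
def stamp_version(field)
  (field & 0x0fff) | 0x4000   # keep low 12 bits, force top 4 bits to 0100
end

def stamp_variant(field)
  (field & 0x3fff) | 0x8000   # keep low 14 bits, force top 2 bits to 10
end

original_2 = 0xc9db                      # ary[2] before masking (51675)
original_3 = 0xf8c4                      # ary[3] before masking (63684)
stamped_2  = stamp_version(original_2)   # 0x49db, as seen in the UUID
stamped_3  = stamp_variant(original_3)   # 0xb8c4, as seen in the UUID

# Every 16-bit value that differs only in the overwritten bits maps to
# the same stamped result, so the original cannot be singled out.
version_candidates = (0..0xf).map { |top| (top << 12) | (stamped_2 & 0x0fff) }
variant_candidates = (0..0x3).map { |top| (top << 14) | (stamped_3 & 0x3fff) }

puts version_candidates.all? { |c| stamp_version(c) == stamped_2 }
puts variant_candidates.all? { |c| stamp_variant(c) == stamped_3 }
puts version_candidates.size * variant_candidates.size
```

The last line counts how many distinct 16-byte inputs would have produced this exact UUID, one for each combination of the discarded bits.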

As the above lines and the RFC might hint, what you really want isn't so much the UUID's bytes as its bits, in order to reverse-engineer its components. It really is a 128-bit integer, and the hex representation is basically the result of passing the entire thing through bin2hex and reformatting it so it's reasonably readable.
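As a rough sketch of that idea in Ruby, the following turns the UUID from the question back into its 16 bytes, its six integer fields, and a single 128-bit integer. It uses only core String#delete, Array#pack, String#unpack and String#to_i; the variable names are illustrative.

```ruby
uuid = "4b6d2066-78ac-49db-b8c4-9f58d8e8842f"

hex   = uuid.delete("-")        # 32 hex digits
bytes = [hex].pack("H*")        # 16-byte binary string
ary   = bytes.unpack("NnnnnN")  # same layout SecureRandom.uuid formats from
as_integer = hex.to_i(16)       # the whole UUID as one 128-bit integer

version = ary[2] >> 12          # top 4 bits of the third field
variant = ary[3] >> 14          # top 2 bits of the fourth field

# Formatting the fields again reproduces the input string.
round_trip = "%08x-%04x-%04x-%04x-%04x%08x" % ary

puts bytes.bytesize
puts ary.inspect
puts version
puts variant.to_s(2)
puts round_trip == uuid
```

Note that these are the bytes of the UUID itself, that is, the random bytes after the version and variant bits were stamped in. They match the original random_bytes output everywhere except in those 6 overwritten bits.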

Here's a variation of similar code in PHP, for reference:

https://github.com/UnionOfRAD/lithium/blob/master/util/String.php#L55

Upvotes: 2
