Reputation: 4879
I'm trying to take a UUID and convert it back into the bytes it was generated from.
I've been studying the SecureRandom source to see if I can reverse-engineer the UUID back into bytes, but I'm having a hard time with it.
What I need to do is basically the inverse of this:
def self.uuid
  ary = self.random_bytes(16).unpack("NnnnnN")
  ary[2] = (ary[2] & 0x0fff) | 0x4000
  ary[3] = (ary[3] & 0x3fff) | 0x8000
  "%08x-%04x-%04x-%04x-%04x%08x" % ary
end
So I have this uuid:
"4b6d2066-78ac-49db-b8c4-9f58d8e8842f"
Which started out as this string of bytes:
"Km fx\xAC\xC9\xDB\xF8\xC4\x9FX\xD8\xE8\x84/"
Which was this unpacked:
[
[0] 1265442918,
[1] 30892,
[2] 51675,
[3] 63684,
[4] 40792,
[5] 3639116847
]
And this after ary[2] and ary[3] were modified:
[
[0] 1265442918,
[1] 30892,
[2] 18907,
[3] 47300,
[4] 40792,
[5] 3639116847
]
So I got the original uuid split back out into its unjoined parts:
[
[0] "4b6d2066",
[1] "78ac",
[2] "49db",
[3] "b8c4",
[4] "9f58",
[5] "d8e8842f"
]
The problem I'm having is that I'm not sure what the inverse of these two lines is:
ary[2] = (ary[2] & 0x0fff) | 0x4000
ary[3] = (ary[3] & 0x3fff) | 0x8000
And I'm also not sure how to turn the remaining elements back into their integer values. I'm sure it is some form of pack or unpack, but I'm not sure what it needs.
Upvotes: 1
Views: 2013
Reputation: 78561
The problem I'm having is that I'm not sure what the inverse of these two lines is:
ary[2] = (ary[2] & 0x0fff) | 0x4000
ary[3] = (ary[3] & 0x3fff) | 0x8000
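Those two lines aren't reversible. They use bitwise arithmetic to set the version and variant bits of a v4 (aka random) UUID, for RFC 4122 compliance. In doing so they overwrite six of the original random bits (the top four of ary[2] and the top two of ary[3]), and those bits cannot be recovered from the UUID.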
http://www.ietf.org/rfc/rfc4122.txt
Applied to the high byte of each of those two fields, the numbers should correspond to:
byte & UUID_CLEAR_VER | UUID_VERSION_4 = byte & b'00001111' | b'01000000'
byte & UUID_CLEAR_VAR | UUID_VAR_RFC = byte & b'00111111' | b'10000000'
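To make the loss of information concrete, here is a minimal Ruby sketch using the values from the question. The helper name stamp_version_and_variant is made up for illustration and is not part of SecureRandom. It shows that two different inputs collapse to the same output, which is why no inverse can exist; only the low 12 and low 14 bits survive.

```ruby
# Hypothetical helper (not part of SecureRandom): applies the same two
# masks that SecureRandom.uuid applies to the third and fourth fields.
def stamp_version_and_variant(third, fourth)
  [(third & 0x0fff) | 0x4000, (fourth & 0x3fff) | 0x8000]
end

# The question's original values, and a second pair that differs only
# in the bits the masks throw away.
a = stamp_version_and_variant(0xc9db, 0xf8c4)
b = stamp_version_and_variant(0x19db, 0x38c4)

puts a.map { |n| "%04x" % n }.inspect
puts a == b

# All that can be recovered: the low 12 bits and the low 14 bits.
surviving = [a[0] & 0x0fff, a[1] & 0x3fff]
puts surviving.map { |n| "%04x" % n }.inspect
```

The byte-wide masks quoted above are simply the high bytes of the 16-bit masks in the Ruby source, for example 0x0fff >> 8 is 0b00001111.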
As the above lines and the RFC might hint, what you really want isn't so much the UUID's bytes as its bits, in order to reverse-engineer its components. It really is a 128-bit integer, and the hex representation is basically the result of passing the entire thing through bin2hex and reformatting it so it's reasonably readable.
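In Ruby, one way to go from the formatted string back to the 16 bytes (as they were after the version and variant bits were set, not the original random bytes) is to strip the dashes and pack the hex digits. This is only a sketch using the UUID from the question; the variable names are my own.

```ruby
uuid = "4b6d2066-78ac-49db-b8c4-9f58d8e8842f"

# Drop the dashes to get 32 hex digits, i.e. 128 bits.
hex = uuid.delete("-")

# "H*" packs a hex string (high nibble first) into a binary string.
bytes = [hex].pack("H*")

# Same layout SecureRandom.uuid used: 32, 16, 16, 16, 16 and 32 bits,
# all big-endian.
fields = bytes.unpack("NnnnnN")

# The whole UUID viewed as a single 128-bit integer.
as_integer = hex.to_i(16)

puts bytes.bytesize
puts fields.inspect
puts "%08x-%04x-%04x-%04x-%04x%08x" % fields
```

The fields array matches the second array in the question (the one after ary[2] and ary[3] were modified), and formatting it again reproduces the UUID string.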
Here's a variation of similar code in PHP, for reference:
https://github.com/UnionOfRAD/lithium/blob/master/util/String.php#L55
Upvotes: 2