Reputation: 973
What would be a good way to abbreviate a UUID for use in a button in a user interface when the ID is all we know about the target?
GitHub seems to abbreviate commit IDs by taking the first 7 characters. For example, b1310ce6bc3cc932ce5cdbe552712b5a3bdcb9e5 would appear in a button as b1310ce. While not perfect this shorter version is sufficient to look unique in the context where it is displayed. I'm looking for a similar solution that would work for UUIDs. I'm wondering if some part of the UUID is more random than another.
The most straightforward option would be to split at the dashes and use the first group. The UUID 42e9992a-8324-471d-b7f3-109f6c7df99d would then be abbreviated as 42e9992a. All of the solutions I can come up with seem equally arbitrary. Perhaps there is some outside-the-box user interface design solution that I haven't thought of.
Upvotes: 10
Views: 1627
Reputation: 917
Showing only the first x characters isn't a good idea for UUIDv7, since it begins with a timestamp: IDs generated close together in time share the same leading characters.
Structure
UUIDv7 looks like this when represented as a string:
0190163d-8694-739b-aea5-966c26f8ad91
└─timestamp─┘ │└─┤ │└───rand_b─────┘
             ver │var
              rand_a
The 128-bit value consists of several parts:
timestamp (48 bits) is a Unix timestamp in milliseconds.
ver (4 bits) is a UUID version (7).
rand_a (12 bits) is randomly generated.
var (2 bits) is the variant, equal to binary 10.
rand_b (62 bits) is randomly generated.
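The layout above can be checked with a few bit shifts. This is a minimal sketch, not a reference implementation: split_uuid7 and abbreviate_tail are hypothetical helper names, and taking the last 8 characters is just one assumed way to draw the label from rand_b instead of the timestamp.

```python
import uuid

def split_uuid7(text):
    """Split a UUIDv7 string into the fields listed above."""
    value = uuid.UUID(text).int  # the full 128-bit value as an integer
    return {
        "timestamp_ms": value >> 80,         # top 48 bits
        "ver": (value >> 76) & 0xF,          # next 4 bits
        "rand_a": (value >> 64) & 0xFFF,     # next 12 bits
        "var": (value >> 62) & 0x3,          # next 2 bits
        "rand_b": value & ((1 << 62) - 1),   # low 62 bits
    }

def abbreviate_tail(text, length=8):
    """Hypothetical abbreviation: take characters from the random tail."""
    return text[-length:]

fields = split_uuid7("0190163d-8694-739b-aea5-966c26f8ad91")
print(fields["ver"], bin(fields["var"]), hex(fields["rand_a"]))
print(abbreviate_tail("0190163d-8694-739b-aea5-966c26f8ad91"))
```

Two UUIDv7 values created in the same millisecond have identical first 12 hex digits, while their tails are expected to differ.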
Upvotes: 1
Reputation: 2154
The entropy of a UUID is highest in the first bits for UUID V1 and V2, and evenly distributed for V3, V4 and V5. So the first N characters are no worse than any other N-character subset.
For N=8, i.e. the group before the first dash, the odds of a collision within a list you could reasonably display on a single GUI screen are vanishingly small.
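To put a rough number on "vanishingly small", the birthday approximation can be used. This is a sketch that assumes the 8 hex characters are uniformly random (reasonable for V4, not for timestamp-led versions); collision_probability is a hypothetical helper name.

```python
import math

def collision_probability(items, hex_chars=8):
    """Approximate chance that at least two of `items` random prefixes match."""
    space = 16 ** hex_chars  # number of possible prefixes (2**32 for 8 chars)
    # Birthday approximation: p = 1 - exp(-n(n-1) / (2 * space))
    return -math.expm1(-items * (items - 1) / (2 * space))

for n in (20, 100, 1000):
    print(n, collision_probability(n))
```

Under that assumption, a screen showing 100 IDs has a collision chance on the order of one in a million, and it takes tens of thousands of IDs before a collision becomes more likely than not.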
Upvotes: 6
Reputation: 973
After thinking about this for a while I realised that the short git commit hash is used as part of command line commands. Since this requirement does not exist for UUIDs and graphical user interfaces, I simply decided to use an ellipsis for the abbreviation, like so: 42e9992...
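A minimal sketch of that abbreviation, assuming a 7-character prefix as in the example; the function name abbreviate and its parameters are made up for illustration.

```python
def abbreviate(uuid_text, length=7, marker="..."):
    """Return the first `length` characters followed by an ellipsis marker."""
    if len(uuid_text) <= length:
        return uuid_text  # nothing to cut, so no ellipsis is needed
    return uuid_text[:length] + marker

label = abbreviate("42e9992a-8324-471d-b7f3-109f6c7df99d")
print(label)  # 42e9992...
```

The ellipsis signals to the user that the label is a truncated value rather than the complete ID.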
Upvotes: 0
Reputation: 18595
The question is whether you want to show part of the UUID or only ensure that unique strings are presented as shorter unique strings. If you want to focus on the latter, which appears to be the goal you are suggesting in your opening paragraph:
(...) While not perfect this shorter version is sufficient to look unique in the context where it is displayed. (...)
you can make use of hashing.
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value.
Hashing is very common and easy to use in many popular languages. A simple approach in Python:
import hashlib
import uuid
encoded_str = uuid.UUID('42e9992a-8324-471d-b7f3-109f6c7df99d').bytes
hash_uuid = hashlib.sha1(encoded_str).hexdigest()
hash_uuid[:10]
'b6e2a1c885'
As expected, a small change in the input produces a completely different hash, so the two UUIDs remain visibly distinct.
# Second digit is replaced with 3, rest of the string remains untouched
encoded_str_two = uuid.UUID('43e9992a-8324-471d-b7f3-109f6c7df99d').bytes
hash_uuid_two = hashlib.sha1(encoded_str_two).hexdigest()
hash_uuid_two[:10]
'406ec3f5ae'
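Since the stated goal is only to look unique in the context where the labels are displayed, the same hashing idea could be extended to pick the shortest prefix length that keeps every label on a given screen distinct. This is a sketch under that assumption; short_labels and min_length are hypothetical names, not part of hashlib or uuid.

```python
import hashlib
import uuid

def short_labels(uuid_strings, min_length=7):
    """Map each UUID string to a SHA-1 prefix, lengthened until all are distinct."""
    digests = [hashlib.sha1(uuid.UUID(s).bytes).hexdigest() for s in uuid_strings]
    distinct = len(set(digests))
    # A SHA-1 hex digest has 40 characters, so the loop stops there at the latest.
    for length in range(min_length, 41):
        labels = [d[:length] for d in digests]
        if len(set(labels)) == distinct:
            return dict(zip(uuid_strings, labels))
    return dict(zip(uuid_strings, digests))

labels = short_labels([
    "42e9992a-8324-471d-b7f3-109f6c7df99d",
    "43e9992a-8324-471d-b7f3-109f6c7df99d",
])
print(labels)
```

The trade-off is that a label can change when the set of displayed IDs changes, so a fixed prefix length may be preferable if labels need to stay stable.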
Upvotes: 2