Andrew Moore

Reputation: 23

Advice for decoding binary/hex WAV file metadata - Pro Tools UMID chunk

Pro Tools (AVID's DAW software) manages and links to all of its media using a Unique ID field, which gets embedded into the WAV file as a umid metadata chunk. Examining a particular file inside Pro Tools, I can see that the file's Unique ID is an 11-character string that looks like this: rS9ipS!x6Tf.

When I examine the raw data inside the WAV file, I find a 32-byte block of data - 4 bytes for the chars 'umid'; 4 bytes for the size of the following data block - 24; then the 24-byte data block, which, when examined in Hex Fiend, looks like this:

00000000 0000002A 5B7A5FFB 0F23DB11 00000000 00000000

As you can see, only 9 bytes contain any non-zero information, yet this is somehow being used to store the 11-char Unique ID field. It looks to me as if the raw data is being interpreted in some way to produce that Unique ID string, but none of my attempts to decode it have been fruitful. I have tried using https://gchq.github.io/CyberChef/ to run it through all the formats that would make sense, but nothing is pointing me in the right direction. I have also tried looking at the data in 6-bit increments to see if it's being packed in some way (9 bytes * 8 bits == 72 == 12 blocks * 6 bits), but I have not stumbled on a pattern yet.
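For anyone wanting to reproduce the hex dump above, here is a minimal sketch of pulling the umid payload out of a file by walking the top-level RIFF chunks. The helper name find_riff_chunk and the in-memory sample buffer are made up for illustration, and the sketch assumes a plain RIFF/WAVE file (not RF64/BW64) with little-endian chunk sizes.

```python
import struct

def find_riff_chunk(data: bytes, chunk_id: bytes):
    """Return the payload of the first top-level chunk with this ID, or None."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip 'RIFF', the overall size field, and 'WAVE'
    while pos + 8 <= len(data):
        # Each chunk header is a 4-byte ID plus a little-endian uint32 size
        cid, size = struct.unpack_from("<4sI", data, pos)
        if cid == chunk_id:
            return data[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    return None

# Hypothetical stand-in for a real file: a WAVE container holding only
# the 24-byte umid payload from the question.
payload = bytes.fromhex("00000000 0000002A 5B7A5FFB 0F23DB11 00000000 00000000")
body = b"WAVE" + b"umid" + struct.pack("<I", len(payload)) + payload
sample = b"RIFF" + struct.pack("<I", len(body)) + body

print(find_riff_chunk(sample, b"umid").hex(" ", 4))
```

With a real file, the same call would be made on the bytes read from disk, for example open(path, "rb").read().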

So I'm wondering if anyone has any specific tips/tricks/suggestions on how best to figure out what might be happening here - how to unpack this data in such a way that I might be able to end up with enough information to generate those 11 chars, of what I'm guessing would most likely be UTF-8.

Any and all help/suggestions welcome! Thanks.

Upvotes: 2

Views: 582

Answers (1)

Mark Reid

Reputation: 26

It seems to be a base64-style encoding with a slightly different character map, and with the least significant 6 bits emitted first. Here is my Python implementation, which is the closest match to Pro Tools that I have found.

char_map = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789#!"

def encode_unique_id(uint64_value):
    # the unique ID is a uint64_t, so mask the input down to 64 bits
    value = uint64_value & 0xFFFFFFFFFFFFFFFF
    if value == 0:
        return ""

    # calculate the minimum number of bytes
    # needed to store the value
    byte_length = 0
    tmp = value
    while tmp:
        tmp >>= 8
        byte_length += 1

    # calculate number of chars needed to store encoding
    char_total, remainder = divmod(byte_length * 8, 6)
    if remainder:
        char_total += 1

    s = ""
    for i in range(char_total):
        value, index = divmod(value, 64)
        s += char_map[index]
    return s

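Running encode_unique_id(0x2A5B7A5FFB0F23DB11) should give you rS9ipS!x6Tf. Note that the 64-bit mask drops the leading 0x2A byte, so only the low 8 bytes (0x5B7A5FFB0F23DB11) actually contribute to the string.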
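Going the other way, a rough sketch of the inverse mapping may be useful for checking an ID shown in Pro Tools against the bytes in the chunk. The name decode_unique_id is made up here, and reading the ID as a big-endian integer from bytes 8 to 15 of the payload is an assumption based only on the single sample in the question.

```python
CHAR_MAP = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789#!"

def decode_unique_id(text: str) -> int:
    """Rebuild the integer that encode_unique_id would turn into `text`."""
    value = 0
    # The first character carries the least significant 6 bits, so walk
    # the string backwards and shift earlier digits up as we go.
    for ch in reversed(text):
        value = (value << 6) | CHAR_MAP.index(ch)  # ValueError on unknown chars
    return value & 0xFFFFFFFFFFFFFFFF

payload = bytes.fromhex("00000000 0000002A 5B7A5FFB 0F23DB11 00000000 00000000")
# Assumption from this one sample: the ID is stored big-endian in bytes 8..15
stored = int.from_bytes(payload[8:16], "big")

print(hex(decode_unique_id("rS9ipS!x6Tf")))       # 0x5b7a5ffb0f23db11
print(decode_unique_id("rS9ipS!x6Tf") == stored)  # True
```

What the 0x2A byte at offset 7 means is not established by this sample, so the sketch simply leaves it out.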

Upvotes: 1
