Reputation: 11330
I am documenting an old file format and have stumped myself with the following issue.
It seems that integers are variable-length encoded: numbers <= 0x7F are encoded in a single byte, while numbers >= 0x80 are encoded in two bytes. An example set of integers and their encoded counterparts:
0x390 is encoded as 0x9007
0x150 is encoded as 0xD002
0x82 is encoded as 0x8201
0x89 is encoded as 0x8901
I have yet to come across any numbers larger than 0xFFFF, so I can't be sure if/how they are encoded. For the life of me, I can't work out the pattern here. Any ideas?
Upvotes: 3
Views: 12412
Reputation: 65156
At a glance it looks like the numbers are split into 7-bit chunks, each of which is encoded as the 7 least significant bits of an output byte, while the most significant bit signals whether more bytes follow (i.e. the last byte of an encoded integer has 0 as its MSB).
The least significant chunk of the input comes first, so I guess you could call this "little endian". For example, 0x9007 decodes as: 0x90 has its MSB set, so take its low 7 bits (0x10) and keep going; 0x07 has its MSB clear, so it's the last byte, giving (0x07 << 7) | 0x10 = 0x390.
Edit: see https://en.wikipedia.org/wiki/Variable-length_quantity (variants of this are used in MIDI and Google protocol buffers, though MIDI stores the most significant group first, whereas this format, like protobuf varints, stores the least significant group first)
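Here's a minimal Python sketch of the scheme. The function names are mine, and extending the same rule beyond two bytes (for values above 0x3FFF) is an assumption the question's data can't confirm:

```python
def encode_varint(value):
    """Encode a non-negative integer as little-endian 7-bit groups.
    Every byte except the last has its MSB set (continuation flag)."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        chunk = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(chunk | 0x80)  # more bytes follow: set the MSB
        else:
            out.append(chunk)         # final byte: MSB clear
            return bytes(out)

def decode_varint(data, offset=0):
    """Decode one integer starting at `offset`; return (value, next_offset)."""
    value = 0
    shift = 0
    for i in range(offset, len(data)):
        byte = data[i]
        value |= (byte & 0x7F) << shift  # low chunks arrive first
        if not (byte & 0x80):            # MSB clear: this was the last byte
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")

# The examples from the question round-trip:
assert encode_varint(0x390) == bytes([0x90, 0x07])
assert encode_varint(0x150) == bytes([0xD0, 0x02])
assert encode_varint(0x82) == bytes([0x82, 0x01])
assert encode_varint(0x89) == bytes([0x89, 0x01])
assert decode_varint(bytes([0x90, 0x07]))[0] == 0x390
```

Under this rule a number above 0xFFFF would simply take more bytes (e.g. three bytes cover up to 0x1FFFFF), but whether the old format actually does that is a guess until you find such a value in the wild.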
Upvotes: 11