Aidan Steele

Reputation: 11330

What is this variable-length integer encoding?

I am documenting an old file format and have stumped myself with the following issue.

It seems that integers are variable-length encoded: numbers <= 0x7F are encoded in a single byte, while numbers >= 0x80 are encoded in two bytes. An example set of integers and their encoded counterparts:

I have yet to come across any numbers that are larger than 0xFFFF, so I can't be sure if/how they are encoded. For the life of me, I can't work out the pattern here. Any ideas?

Upvotes: 3

Views: 12412

Answers (1)

Matti Virkkunen

Reputation: 65156

At a glance it looks like the numbers are split into 7-bit chunks, each of which is encoded as the 7 least significant bits of an output byte, while the most significant bit signifies whether there are more bytes following this one (i.e. the last byte of an encoded integer has 0 as its MSB).

The least significant bits of the input come first, so I guess you could call this "little endian".
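
To make the scheme concrete, here is a minimal Python sketch of an encoder and decoder for it. It assumes unsigned integers and the least-significant-group-first order described above; the function names `encode_varint` and `decode_varint` are made up for illustration and do not come from the file format in question.

```python
def encode_varint(value):
    """Encode a non-negative int as 7-bit groups, least significant group first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        chunk = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(chunk | 0x80)  # MSB set: more bytes follow
        else:
            out.append(chunk)         # MSB clear: this is the last byte
            return bytes(out)


def decode_varint(data, offset=0):
    """Decode one integer starting at offset; return (value, next_offset).

    Raises IndexError if the data ends before a byte with MSB 0 is seen.
    """
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift  # place this group above the earlier ones
        if not byte & 0x80:              # MSB clear: last byte of this integer
            return value, offset
        shift += 7


print(encode_varint(0x7F).hex())           # 7f
print(encode_varint(0x80).hex())           # 8001
print(decode_varint(bytes([0x80, 0x01])))  # (128, 2)
```

If the format really follows this scheme, two bytes cover values up to 0x3FFF, and anything larger (0xFFFF included) would take three bytes, so it is worth checking the sample data against that.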

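Edit: see https://en.wikipedia.org/wiki/Variable-length_quantity (the same idea is used in MIDI and Google Protocol Buffers; Protocol Buffers varints use this least-significant-group-first order, whereas MIDI puts the most significant group first)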

Upvotes: 11
