neha deshpande

Reputation: 86

Maximum field number in a protobuf message

The official protocol buffers documentation (https://developers.google.com/protocol-buffers/docs/proto3) says the maximum field number in a protobuf message is 2^29-1. Why does this limit exist? Could anyone explain it in some detail? I am new to this.

I read the answers to the question "why 2^29-1 is the biggest key in protocol buffers", but they did not make it clear to me.

Upvotes: 5

Views: 8710

Answers (4)

Chan Kim

Reputation: 5979

This is more of a follow-up question than a comment. The document says:

Field numbers in the range 16 through 2047 take two bytes. So you should reserve the numbers 1 through 15 for very frequently occurring message elements. Remember to leave some room for frequently occurring elements that might be added in the future.

Since the top 5 bits of the first byte are used for the field number and the bottom 3 bits for the field type, wouldn't it be field numbers from 31 (because zero is not used) to 2047 that take two bytes? (I also guess that the lower 3 bits of the second byte are used for the field type as well. I'm in the middle of reading about it, so I'll fix this when I know more.)
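For reference, the one-byte/two-byte boundary can be checked with a few lines of C++ (C++14 or later). This is only a sketch, not protobuf's own code: `varint_size` and `tag_size_bytes` are hypothetical helper names. It assumes the documented key layout `(field_number << 3) | wire_type`, encoded as a varint where every byte carries 7 payload bits plus one continuation bit. That continuation bit is why the first byte has room for only 4 bits of field number, so 15 is the last field number that fits in one byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to varint-encode `value`.
// Each varint byte holds 7 payload bits; the top bit means "more bytes follow".
constexpr int varint_size(std::uint64_t value) {
    int bytes = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++bytes;
    }
    return bytes;
}

// Hypothetical helper: encoded size of the key for a field number and wire type.
constexpr int tag_size_bytes(std::uint32_t field_number, std::uint32_t wire_type) {
    assert(field_number >= 1 && field_number <= 536870911u);  // 1 .. 2^29 - 1
    assert(wire_type <= 7);                                   // 3 bits
    return varint_size((static_cast<std::uint64_t>(field_number) << 3) | wire_type);
}

static_assert(tag_size_bytes(15, 7) == 1, "fields 1..15 fit in one byte");
static_assert(tag_size_bytes(16, 0) == 2, "field 16 needs two bytes");
static_assert(tag_size_bytes(2047, 7) == 2, "field 2047 still fits in two bytes");
static_assert(tag_size_bytes(2048, 0) == 3, "field 2048 needs three bytes");
```

The wire type lives only in the lowest 3 bits of the whole key, so the second byte carries 7 more bits of field number and no type bits.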

Upvotes: 0

user2269707

Reputation:

Because of this line:

#define GOOGLE_PROTOBUF_WIRE_FORMAT_MAKE_TAG(FIELD_NUMBER, TYPE) \
  static_cast<uint32>((static_cast<uint32>(FIELD_NUMBER) << 3) | (TYPE))

This line creates a "tag", which leaves only 29 (32 - 3) bits to store the field number.

I don't know why Google uses uint32 instead of uint64, though, since the field number is encoded in a varint. Maybe they think 2^29-1 field numbers are enough for a single message declaration.
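The 32 - 3 = 29 arithmetic can be spelled out in a small sketch (C++14 or later). The names `kWireTypeBits`, `kMaxFieldNumber`, `make_tag`, `tag_field_number` and `tag_wire_type` are made up for illustration and only mirror what the macro above does with a 32-bit tag; they are not protobuf's API.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative names only; they mirror the macro's 32-bit tag layout.
constexpr std::uint32_t kWireTypeBits = 3;
constexpr std::uint32_t kMaxFieldNumber = 0xFFFFFFFFu >> kWireTypeBits;  // 2^29 - 1

constexpr std::uint32_t make_tag(std::uint32_t field_number, std::uint32_t wire_type) {
    assert(field_number <= kMaxFieldNumber);  // must fit in the upper 29 bits
    assert(wire_type < (1u << kWireTypeBits));
    return (field_number << kWireTypeBits) | wire_type;
}

constexpr std::uint32_t tag_field_number(std::uint32_t tag) { return tag >> kWireTypeBits; }
constexpr std::uint32_t tag_wire_type(std::uint32_t tag) { return tag & 7u; }

static_assert(kMaxFieldNumber == 536870911u, "largest field number is 2^29 - 1");
static_assert(make_tag(kMaxFieldNumber, 7) == 0xFFFFFFFFu, "uses all 32 bits");
static_assert(tag_field_number(make_tag(kMaxFieldNumber, 2)) == kMaxFieldNumber, "round trip");
// One past the maximum: the shift pushes the only set bit out of the 32-bit value.
static_assert(((kMaxFieldNumber + 1) << kWireTypeBits) == 0, "2^29 does not fit");
```

In other words, the largest field number fills the 32-bit tag exactly, and field number 2^29 can no longer be represented.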

Upvotes: 1

Maik

Reputation: 3549

Each field in an encoded protocol buffer has a header (called key or tag) prefixed to the actual encoded value. The encoding spec defines this key:

Each key in the streamed message is a varint with the value (field_number << 3) | wire_type – in other words, the last three bits of the number store the wire type.

Here the spec says the tag is a varint whose lowest 3 bits encode the wire type. A varint can encode a 64-bit value, so going by this definition alone the limit would be 2^61-1.

In addition, the Language Guide narrows this down to a value that fits in 32 bits:

The smallest field number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911.

The reasons for this are not given, so I can only speculate:

  1. It is an artificial limit, since no one expects a message to have that many fields. Just think about fitting a message with that many fields into memory.

  2. As the key is a varint, it isn't simply the next 4 bytes in the raw buffer but a variable number of bytes (Java code reading a varint32). Each byte has 7 bits of actual data and 1 bit indicating whether the end has been reached. It could be that limiting the range was deemed better for performance.

  3. Since proto3 is the 3rd version of protocol buffers, it could be that either proto1 or proto2 defined the tag to be a varint32. To keep backwards compatibility this limit is still true in proto3 today.
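To make point 2 concrete, here is a rough sketch of reading a key as a 32-bit varint. It is not protobuf's actual reader: `read_varint32` is a hypothetical helper, and it is stricter than a real decoder needs to be, since it simply rejects anything that does not fit in 32 bits. With a 32-bit limit the loop never looks at more than 5 bytes; a full 64-bit varint could need up to 10.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not protobuf's API: decodes one varint of at most 32 bits.
// Returns the number of bytes consumed, or 0 if the input is truncated or the
// value does not fit in 32 bits.
inline std::size_t read_varint32(const std::uint8_t* data, std::size_t size,
                                 std::uint32_t* out) {
    assert(out != nullptr);
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < size && i < 5; ++i) {
        const std::uint32_t payload = data[i] & 0x7Fu;  // low 7 bits are data
        if (i == 4 && payload > 0x0Fu) {
            return 0;  // the fifth byte may only add 4 bits (28 + 4 = 32)
        }
        result |= payload << (7 * i);
        if ((data[i] & 0x80u) == 0) {  // continuation bit clear: last byte
            *out = result;
            return i + 1;
        }
    }
    return 0;  // ran out of input, or more than 5 bytes
}
```

Once the key is decoded, `key >> 3` gives the field number and `key & 7` gives the wire type. The largest accepted key, bytes `FF FF FF FF 0F`, decodes to field number 2^29-1 with wire type 7.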

Upvotes: 3

Marc Gravell

Reputation: 1063338

I suspect this is simply so that a field-header (wire-type and tag-number) can be decoded and handled as a 32-bit value. The wire-type is always the 3 least significant bits, leaving 29 bits for the tag number. Technically "varint" should support 64 bits, but it makes sense to limit it to reasonable numbers, not least because "varint" encoding means that larger numbers take more bytes to encode.

Edit: I realise now that this is similar to the linked post, but... it remains true! Each field in protobuf is prefixed by a "varint" that expresses what field (tag-number) follows, and what data type it is (wire-type). The latter is important especially so that unexpected fields (version differences) can be stored or skipped correctly. It is convenient for that field-header to be trivially processed by most frameworks, and most frameworks are fine with 32-bit integers.
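As an illustration of why the wire-type matters for skipping, here is a rough sketch of how a reader could step over a field it does not recognise. The helper names `read_varint64` and `skip_value` are invented for this example and are not any library's API. The wire-type numbers are the ones from the encoding documentation (0 = varint, 1 = 64-bit, 2 = length-delimited, 5 = 32-bit); the deprecated group types 3 and 4 are left out, and varint overflow is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Invented helper: reads a varint of up to 10 bytes.
// Returns bytes consumed, or 0 if the input is truncated or too long.
inline std::size_t read_varint64(const std::uint8_t* data, std::size_t size,
                                 std::uint64_t* out) {
    assert(out != nullptr);
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < size && i < 10; ++i) {
        result |= static_cast<std::uint64_t>(data[i] & 0x7Fu) << (7 * i);
        if ((data[i] & 0x80u) == 0) {
            *out = result;
            return i + 1;
        }
    }
    return 0;
}

// Invented helper: size in bytes of the value that follows a field-header with
// the given wire-type, or 0 if it is truncated, malformed, or not handled here.
inline std::size_t skip_value(std::uint32_t wire_type, const std::uint8_t* data,
                              std::size_t size) {
    std::uint64_t value = 0;
    switch (wire_type) {
        case 0:  // varint: length is found by scanning for the last byte
            return read_varint64(data, size, &value);
        case 1:  // fixed 64-bit
            return size >= 8 ? 8 : 0;
        case 2: {  // length-delimited: varint length, then that many bytes
            const std::size_t n = read_varint64(data, size, &value);
            if (n == 0 || value > size - n) return 0;
            return n + static_cast<std::size_t>(value);
        }
        case 5:  // fixed 32-bit
            return size >= 4 ? 4 : 0;
        default:  // 3 and 4 (groups) are not handled in this sketch
            return 0;
    }
}
```

A reader that does not know field N can decode the header, take `header & 7` as the wire-type, and advance by `skip_value(...)` bytes without knowing anything else about the field.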

Upvotes: 0
