Reputation: 7359
I'm trying to design my first file format in ProtoBuf, and I'm not sure what the best choice is in some cases, because the memory/stream layout is not entirely clear to me.
So I actually have several questions, all closely related:
1) What does an optional field cost, when it is omitted?
I think it should only cost one bit, since a bit-field can be used to flag present/absent fields, but I don't know for sure. They might instead use a whole byte per optional field.
2) What does a repeated field cost when it is empty? Is it also one bit, like the optional field, or is it "field header" + one (varint) byte to say it is size 0?
3) Since "bytes" implicitly carries a size, is there actually a size difference between a missing optional bytes field and an empty required bytes field?
[EDIT] By "memory" I meant space used on the file system or network bandwidth; I did not mean RAM, since that would depend on the programming language.
Upvotes: 13
Views: 5568
Reputation: 1063358
1: nothing whatsoever - it is omitted completely on the wire
2: nothing whatsoever - only actual contents are included; an empty list is essentially omitted (possible exception: empty "packed" arrays, which an encoder might write as a field header plus a zero length; although even that could legitimately be omitted)
3: an omitted field costs nothing; a present but zero-length field costs at least 2 bytes - one field header (its length depends on the field number; field numbers 1 to 15 take 1 byte), and one length prefix of zero (1 byte)
Additional note: protobuf never uses sub-byte packing, so any field that is written always occupies a whole number of bytes.
(context: I've written a protobuf implementation from first principles, so the encoding details are fairly familiar to me)
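To make the byte counts in point 3 concrete, here is a minimal hand-rolled sketch of the wire encoding for a length-delimited (bytes) field. It does not use any protobuf library; the helper names `encode_varint` and `encode_bytes_field` are made up for illustration, and it assumes the standard layout of a varint key `(field_number << 3) | wire_type` followed by a varint length and the payload.

```python
LENGTH_DELIMITED = 2  # wire type for bytes, strings, nested messages, packed repeated


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Encode one bytes field: key varint, length varint, then the payload."""
    key = encode_varint((field_number << 3) | LENGTH_DELIMITED)
    return key + encode_varint(len(payload)) + payload


# An omitted field contributes no bytes at all to the message.
absent = b""

# A present but empty bytes field: 1-byte header + 1-byte zero length.
empty_low = encode_bytes_field(1, b"")

# Field numbers 16 and up need a 2-byte header, so the cost rises to 3 bytes.
empty_high = encode_bytes_field(16, b"")

print(len(absent), empty_low.hex(), empty_high.hex())
```

So the cheapest way to represent "no data" is to leave the field out entirely, and keeping frequently used fields in the 1 to 15 range saves a byte each time they are written.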
Upvotes: 19