user1630946

Reputation: 41

How does Protobuf encode the oneof message construct?

This Python program serializes a message with protobuf, and the encoded output is:

0a 10 08 7f 8a 01 04 08 02 10 03 92 01 04 08 02 10 03 18 01

What I don't understand is why there is a 01 after 8a, and again a 01 after 92. It seems that an extra 01 is added for fields that contain a oneof, but why? Please help if you understand protobuf encoding.

import sys
import test2_pb2

def write_to_file(file,value):
    with open(file, "wb") as f:
        f.write(value)

def cell_test_dct():
    # Build the message and fill in every field that should be encoded.
    msg = test2_pb2.TestPrimM()
    msg.textx = 1
    msg.testmsg.testint = 127
    msg.testmsg.prbbundlingtype.x = 2
    msg.testmsg.prbbundlingtype.static = test2_pb2.BUNDLE_SIZE_N_4_WIDEBAND
    msg.testmsg.bundlingtype.x = 2
    msg.testmsg.bundlingtype.static = test2_pb2.BUNDLE_SIZE_N_4_WIDEBAND
    print(msg)
    # SerializeToString() returns bytes; avoid shadowing the built-in str.
    data = msg.SerializeToString()
    # write_to_file("/tmp/protobuf_test_file.bin", data)

def main_test():
    cell_test_dct()

main_test()

It uses the following protobuf file:

package NR_TEST;

enum BundleSizeE {
    BUNDLE_SIZE_N_4 = 0;
    BUNDLE_SIZE_WIDEBAND = 1;
    BUNDLE_SIZE_N_2_WIDEBAND = 2;
    BUNDLE_SIZE_N_4_WIDEBAND = 3;
}

message DynamicBundleSizesM {
    // bundleSizeSet1
    optional BundleSizeE bundleSizeSet1 = 1;
    // bundleSizeSet2
    optional BundleSizeE bundleSizeSet2 = 2;
}

message PrbBundlingTypeM {
    optional uint32 x = 1;
    oneof BundlingTypeC {
        BundleSizeE static = 2;
        DynamicBundleSizesM dynamic = 3;
    }
}

message Test {
    required int32 testint   =1;
    required PrbBundlingTypeM prbbundlingtype = 17;
    required PrbBundlingTypeM bundlingtype = 18;
}
message TestPrimM {
    oneof TestMsgC {
        Test testmsg = 1;
        int32 nomsg = 2;
    }
    required int32 textx = 3;
}

Upvotes: 2

Views: 2440

Answers (1)

vlp

Reputation: 8116

The protobuf encoding of your message breaks down as follows:

0a .. varint key '1|010' -> field 1, type LENGTH_DELIMITED
    10 .. varint length -> 16
    Contents:
        08 .. varint key '1|000' -> field 1, type VARINT
            7f .. varint value -> 127
        8a 01 .. varint key '00000010001|010' -> field 17, type LENGTH_DELIMITED
            04 .. varint length -> 4
            Contents:
                08 .. varint key '1|000' -> field 1, type VARINT
                    02 .. varint value -> 2
                10 .. varint key '10|000' -> field 2, type VARINT
                    03 .. varint value -> 3
        92 01 .. varint key '00000010010|010' -> field 18, type LENGTH_DELIMITED
            04 .. varint length -> 4
            Contents:
                08 .. varint key '1|000' -> field 1, type VARINT
                    02 .. varint value -> 2
                10 .. varint key '10|000' -> field 2, type VARINT
                    03 .. varint value -> 3
18 .. varint key '11|000' -> field 3, type VARINT
    01 .. varint value -> 1
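
The breakdown above can be reproduced with a short pure-Python sketch that walks the bytes without the protobuf library. The names read_varint, walk and the nested dictionary are made up for this illustration; nested only tells the walker which length-delimited field numbers hold sub-messages in this particular schema, and only wire types 0 and 2 are handled:

```python
def read_varint(buf, pos):
    """Decode one varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:               # MSB clear: this was the last byte
            return result, pos
        shift += 7


def walk(buf, nested, depth=0):
    """Return (depth, field_number, wire_type_name, value_or_length) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
            fields.append((depth, number, "VARINT", value))
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            payload = buf[pos:pos + length]
            pos += length
            fields.append((depth, number, "LENGTH_DELIMITED", length))
            if number in nested:  # assumption: this field is a sub-message
                fields.extend(walk(payload, nested[number], depth + 1))
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
    return fields


# Hypothetical description of which fields are nested messages:
# TestPrimM.testmsg (1) -> Test.prbbundlingtype (17) / bundlingtype (18)
# -> PrbBundlingTypeM.dynamic (3).
nested = {1: {17: {3: {}}, 18: {3: {}}}}
data = bytes.fromhex("0a 10 08 7f 8a 01 04 08 02 10 03 92 01 04 08 02 10 03 18 01")

fields = walk(data, nested)
for depth, number, kind, value in fields:
    print("    " * depth + f"field {number}, {kind}, {value}")
```

Nothing in the walker knows about oneof: the members static and dynamic are read like any other field.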

The key of length-delimited field 17 is, in binary, 10001|010 (10001 is the field number 17 and 010 is the length-delimited wire type), which gives 10001010 (138 in decimal).

To encode this number as a varint, first pad it on the left with zeroes so that its total bit length is a multiple of 7:

-> 00000010001010

Then split it into groups of 7 bits:

-> 0000001 0001010

Then reverse the order of those groups:

-> 0001010 0000001

Finally, prepend one extra bit (the MSB) to each group: zero for the last group and one for all other groups. An MSB equal to 1 tells the parser that another group follows:

-> 10001010 00000001

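Which gives 0x8A 0x01 in hexadecimal (your value). The same reasoning gives 0x92 0x01 for field 18. The trailing 01 has nothing to do with oneof: a key fits in a single byte only for field numbers 1 to 15, so fields 17 and 18 need two bytes.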
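
The steps above can be sketched in a few lines of Python. The helper names encode_varint and encode_key are invented for this example and are not part of the protobuf API:

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        group = value & 0x7F          # take the lowest 7 bits first
        value >>= 7
        if value:
            out.append(group | 0x80)  # more groups follow: set the MSB
        else:
            out.append(group)         # last group: MSB stays 0
            return bytes(out)


def encode_key(field_number, wire_type):
    """Build the field key: field number shifted left by 3, plus wire type."""
    return encode_varint((field_number << 3) | wire_type)


LENGTH_DELIMITED = 2

print(encode_key(17, LENGTH_DELIMITED).hex())  # 8a01
print(encode_key(18, LENGTH_DELIMITED).hex())  # 9201
print(encode_key(15, LENGTH_DELIMITED).hex())  # 7a (still one byte)
print(encode_key(16, LENGTH_DELIMITED).hex())  # 8201 (first two-byte key)
```

Taking the lowest 7 bits first is equivalent to splitting into groups and reversing them, which is why the loop needs no explicit reversal step.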

Varint encoding is described in the Protocol Buffers encoding documentation.


As far as I know, the oneof construct does not change the wire format. It only changes the parser logic, so that all but the last field seen from a single oneof group are ignored.

Good luck!

Upvotes: 2
