Reputation: 43
I read that protobuf has a type called "bytes", which can store an arbitrary number of bytes and is the equivalent of a C++ string. The reason I prefer not to use "bytes" is that it expects its input as a C++ string, i.e., the boost IP would need to be converted to a string. My concern is this: I want to serialize the protobuf message and send it over a TCP socket, and I want the encoded message to be as small as possible.
Currently, I am using the following .proto file:
syntax = "proto2";
message profile
{
repeated uint32 localEndpoint = 1;
repeated uint32 remoteEndpoint = 2;
}
To save a boost IP in the protobuf message, I first convert it to a byte array using "boost::asio::ip::address_v4::to_bytes()". For a v4 IP, the resulting array has 4 bytes. I then combine the first 4 bytes of the array into one uint32_t and store it in the "localEndpoint" field of the protobuf message. For v6, I repeat this for each subsequent group of 4 bytes. I take 4 bytes at a time so as to use all 32 bits of the uint32.
Hence, a v4 address uses 1 occurrence of the "localEndpoint" field, and a v6 address uses 4 occurrences.
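For reference, the packing described above can be sketched roughly as follows. This is only an illustration: `pack_words` is a hypothetical helper name, and it assumes the address bytes arrive as a `std::array<unsigned char, N>` in network (big-endian) order, which is what I understand `to_bytes()` to return.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack big-endian address bytes into 32-bit words,
// four bytes per word. N is assumed to be 4 (IPv4) or 16 (IPv6).
template <std::size_t N>
std::vector<std::uint32_t> pack_words(const std::array<unsigned char, N>& bytes) {
    static_assert(N % 4 == 0, "address length must be a multiple of 4");
    std::vector<std::uint32_t> words;
    words.reserve(N / 4);
    for (std::size_t i = 0; i < N; i += 4) {
        // The most significant byte of each word comes first.
        words.push_back((std::uint32_t{bytes[i]} << 24) |
                        (std::uint32_t{bytes[i + 1]} << 16) |
                        (std::uint32_t{bytes[i + 2]} << 8) |
                        std::uint32_t{bytes[i + 3]});
    }
    return words;
}
```

Each returned word would then be appended to the repeated field, so a v4 address yields one element and a v6 address yields four.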
Please allow me to highlight that if I had used "bytes" here, my input string itself would have been 15 bytes long for a v4 IP like 111.111.111.111.
Using uint32 instead of "bytes" does save me some encoded-data size, but I am looking for a more efficient protobuf type that requires fewer bytes.
Sorry for the long description, but I wanted to explain my query in detail. Please help me. Thanks a lot in advance :)
Upvotes: 3
Views: 5924
Reputation: 1063814
An IPv4 address should require exactly 4 bytes. If you're somehow getting 8, you're doing something wrong - are you perhaps hex-encoding it? You don't need that here. Likewise, IPv6 should be 16 bytes.
4 bytes with a usually-set high byte is most effectively stored as fixed32 - varint would be overhead here, due to the high bits. 16 bytes is more subtle - I'd go with bytes (field header plus length), since it is simpler to form into a union, and if the field-number is large, it avoids having to pay for multiple multi-byte field headers (a length prefix of 16 will always be single-byte).
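To make the varint overhead concrete, here is a small sketch that counts how many bytes the base-128 varint encoding needs for a given value. The name `varint_size` is made up for illustration; it is not part of the protobuf API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative helper: bytes needed by a base-128 varint, which carries
// 7 payload bits per byte and always uses at least one byte.
inline std::size_t varint_size(std::uint64_t value) {
    std::size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++n;
    }
    return n;
}

// A fixed32 payload is always 4 bytes, whatever the value.
constexpr std::size_t kFixed32Size = 4;
```

Any 32-bit value of 2^28 or more needs 5 varint bytes, so an address packed as above costs more than fixed32 as soon as its first octet is 16 or higher (111.111.111.111 needs 5 bytes, while 10.0.0.1 still fits in 4).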
I'd then create a union of these via oneof:
oneof ip_addr {
fixed32 v4 = 1;
bytes v6 = 2;
}
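As a rough size check for this layout, the sketch below adds up the encoded size of each field: a tag (the field number shifted left by 3 bits, written as a varint), then either 4 fixed bytes or a length prefix plus the payload. All function names here are hypothetical and only model the wire format; none of them come from the protobuf library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative helper: bytes needed by a base-128 varint (7 bits per byte).
inline std::size_t varint_size(std::uint64_t value) {
    std::size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++n;
    }
    return n;
}

// The tag is (field_number << 3) | wire_type; the 3-bit wire type never
// changes how many varint bytes the tag occupies.
inline std::size_t tag_size(std::uint32_t field_number) {
    return varint_size(std::uint64_t{field_number} << 3);
}

// fixed32 field: tag + 4 payload bytes.
inline std::size_t fixed32_field_size(std::uint32_t field_number) {
    return tag_size(field_number) + 4;
}

// bytes field: tag + varint length prefix + payload.
inline std::size_t bytes_field_size(std::uint32_t field_number, std::size_t len) {
    return tag_size(field_number) + varint_size(len) + len;
}
```

With the field numbers shown above, this gives 5 bytes for a v4 address and 18 bytes for a v6 address. For comparison, a proto2 repeated uint32 is not packed by default, so each element carries its own tag plus up to 5 varint bytes: up to 6 bytes for v4 and up to 24 bytes for v6.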
Upvotes: 9