Nomada

Reputation: 373

protobuf-net: how to deserialize an array of int (packed data)

We use protobuf to let several systems communicate with each other. Some of them use protobuf-net; others use different languages and libraries. Our application-level messages consist of a header and a body.

In the .NET world we have:

public class ProtobufMessage
{
    public IList<byte[]> Frames { get; }
}
    
RuntimeModel = RuntimeTypeModel.Create();
RuntimeModel.Add(typeof(ProtobufMessage), false).Add(1, nameof(ProtobufMessage.Frames));

(We pass false because of this message: "This operation is not supported for tuple-like types. To disable tuple-like type discovery, use applyDefaultBehaviour: false when first adding the type to the model.")

When a message arrives, we take the header frame and deserialize it. The header gives us the type of the body, which we then use to deserialize the next frame (byte[]):

byte[] bytes; // Body Frame
using var memory = new MemoryStream(bytes);
return typeModel.Deserialize<T>(memory); // T: type from the Header frame

When the body type is a class, everything works fine.
Sometimes, however, the body is just a repeated int32, so there is no proto contract to add to RuntimeModel.

Let's say we are sending [1,2,3] as the body (tag 1). When the data arrives unpacked, as 080108020803, the body frame is decoded correctly.

In proto3, however, repeated fields of scalar numeric types use packed encoding by default, so the same data can also arrive as 0a03010203. When protobuf-net tries to deserialize that, it throws an exception: Invalid wire-data (String)
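To make the difference between the two payloads concrete, here is a minimal sketch that decodes both by hand, without protobuf-net. WireSketch and ReadRepeatedInt32 are hypothetical names made up for this illustration, and the code handles only a single repeated int32 field. The unpacked form repeats the key 08 (field 1, wire type 0, varint) before every value. The packed form has a single key 0a (field 1, wire type 2, length-delimited), then a byte count, then the values back to back. Wire type 2 is presumably what the "(String)" in the exception refers to.

```csharp
using System;
using System.Collections.Generic;

public static class WireSketch
{
    // Reads one base-128 varint starting at pos and advances pos past it.
    static ulong ReadVarint(byte[] data, ref int pos)
    {
        ulong value = 0;
        int shift = 0;
        while (true)
        {
            byte b = data[pos++];
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value; // high bit clear: last byte
            shift += 7;
        }
    }

    // Decodes a repeated int32 field, accepting packed and unpacked encodings.
    public static List<int> ReadRepeatedInt32(byte[] data, int fieldNumber)
    {
        var result = new List<int>();
        int pos = 0;
        while (pos < data.Length)
        {
            ulong key = ReadVarint(data, ref pos);
            int field = (int)(key >> 3);   // upper bits: field number
            int wireType = (int)(key & 7); // lower three bits: wire type
            if (field != fieldNumber)
                throw new NotSupportedException("This sketch handles one field only.");

            if (wireType == 0)
            {
                // Unpacked: one varint value per key.
                result.Add((int)ReadVarint(data, ref pos));
            }
            else if (wireType == 2)
            {
                // Packed: a byte length, then the varint values back to back.
                int length = (int)ReadVarint(data, ref pos);
                int end = pos + length;
                while (pos < end) result.Add((int)ReadVarint(data, ref pos));
            }
            else
            {
                throw new NotSupportedException("Unexpected wire type " + wireType);
            }
        }
        return result;
    }
}
```

Calling ReadRepeatedInt32 with field number 1 on the bytes 08 01 08 02 08 03 and on the bytes 0a 03 01 02 03 yields the same three values in both cases. Both encodings carry the same data; only the framing differs.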

Are we doing something wrong? How can we tell protobuf-net that the incoming data is packed when there is no class to decorate?

Upvotes: 0

Views: 298

Answers (1)

bazza

Reputation: 8414

It feels a bit like GPB is not being used as intended. In GPB you do not need to send a "header" message first to say what the next message is going to be. GPB already provides for this kind of "what's next?" situation; that's what a oneof is for. The keyword is fairly self-describing: the field contains "one of the following".

If there's a range of message types that can be sent from one place to another, the best thing to do is to combine all of those into a single message with a oneof field.

message A
{
..
}

message B
{
..
}

message MessageWrapper
{
    oneof msg
    {
        A a = 1;
        B b = 2;
    }
}

The idea is that all you ever serialise / send / receive / deserialise is a MessageWrapper, and having deserialised it you ask the MessageWrapper object which variant of the oneof it contains.

A side effect is that you have a class for every message type.

Regarding what to do with repeated int32, I think a lot can be learned by examining what happens if you use protoc to compile a .proto file describing your messages. You don't end up with standalone scalar types; you end up with a C# class with scalar types as members. That's a pretty big hint, I think.
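Taking that hint in a code-first protobuf-net project, a rough sketch might look like the following. IntList, IntListDemo and Decode are hypothetical names, not something from the question. The sketch assumes the protobuf-net NuGet package and its attribute API ([ProtoContract], [ProtoMember] with IsPacked), and assumes the other systems send the integers as field 1 of a message. I have not run it against the question's exact setup, so treat it as a starting point.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical wrapper type: gives the repeated int32 a contract to attach to.
[ProtoContract]
public class IntList
{
    // Field 1, marked as packed to match the proto3 default for scalar numerics.
    [ProtoMember(1, IsPacked = true)]
    public List<int> Values { get; set; } = new List<int>();
}

public static class IntListDemo
{
    // Deserializes a body frame as an IntList rather than as a bare int array.
    public static IntList Decode(byte[] bodyFrame)
    {
        using var memory = new MemoryStream(bodyFrame);
        return Serializer.Deserialize<IntList>(memory);
    }
}
```

The design point is simply that the field number and the packed setting live on a member of a class, which is the same shape protoc would generate.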

Also, is there any particular reason to write your project code-first, building the model at runtime? You say you have multiple systems in several languages. The easiest thing to do, and indeed one of the specific reasons Google created GPB in the first place, is to have one single .proto file that you compile to C#, Java, C++, or whatever your project needs. The .proto file is a "single point of truth" for the whole project: every member of the project takes it and compiles it, which guarantees successful interoperability. The approach you appear to be taking, writing classes in C# and decorating them so that they can be serialised to GPB wire format, means

  • repeated work - you have to do this in the other languages too
  • a risk of misunderstanding - you might not write a class in exactly the same way as someone else in another language
  • you cannot easily update the content of messages being passed between systems, because everyone has to re-write to do so.

Far better to have a single .proto file, and compile it.

Upvotes: 2
