How can I deserialize incoming data on the TCP server?

Question

I've set up a server reusing the code found in the documentation where I have self.data = self.request.recv(1024).strip().

But how do I go from this, deserialize it to protobuf message (Message.proto/Message_pb2.py). Right now it seems that it's receiving chunks of 1024 bytes, and that more then one at the time... making it all rubbish :D

Marc Gravell · Accepted Answer

TCP is typically just a stream of data. Just because you sent each packet as a unit, doesn't mean the receiver gets that. Large messages may be split into multiple packets; small messages may be combined into a single packet.

The only way to interpret multiple messages over TCP is with some kind of "framing". With text-based protocols, a CR/LF/CRLF/zero-byte might signify the end of each frame, but that won't work with binary protocols like protobuf. In such cases, the most common approach is to simply prepend each message with the length, for example in a fixed-size (4 bytes?) network-byte-order chunk. Then the payload. In the case of protobuf, the API for your platform may also provide a mechanism to write the length as a "varint".

Then, reading is a matter of:

read an entire length-header
read (and buffer) that many bytes
process the buffered data
rinse and repeat

But keeping in mind that you might have (in a single packet) the end of one message, 2 complete messages, and the start of another message (maybe half of the length-header, just to make it interesting). So: keeping track of exactly what you are reading at any point becomes paramount.

How can I deserialize incoming data on the TCP server?

Answers (1)

Related Questions