Reputation: 599
What is the correct approach to encoding and parsing variable-length messages over TCP? For example, suppose we want to send a message that consists of a mix of text strings and a binary file.
Upvotes: 0
Views: 603
Reputation: 764
Just adding to the excellent answer already posted in this thread.
If the purpose of your question is education, you could take a look at RFC 6455, Section 5.2, to see how messages are framed in the WebSocket protocol.
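To give a feel for that section, here is a rough sketch of how the variable-length part of a WebSocket frame header can be decoded. The function name `parse_ws_header` and its return shape are made up for illustration; only the bit layout comes from the RFC. It does not unmask or validate the payload.

```python
import struct

def parse_ws_header(buf: bytes):
    """Return (fin, opcode, masked, payload_len, header_len),
    or None if buf does not yet contain the whole header."""
    if len(buf) < 2:
        return None
    fin = bool(buf[0] & 0x80)      # top bit of byte 0: final fragment flag
    opcode = buf[0] & 0x0F         # low 4 bits of byte 0: frame type
    masked = bool(buf[1] & 0x80)   # top bit of byte 1: payload is masked
    length = buf[1] & 0x7F         # low 7 bits: length, or an escape value
    offset = 2
    if length == 126:
        # 126 means the real length follows as a 16-bit big-endian integer
        if len(buf) < offset + 2:
            return None
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
    elif length == 127:
        # 127 means the real length follows as a 64-bit big-endian integer
        if len(buf) < offset + 8:
            return None
        (length,) = struct.unpack_from("!Q", buf, offset)
        offset += 8
    if masked:
        offset += 4                # a 4-byte masking key precedes the payload
    if len(buf) < offset:
        return None
    return fin, opcode, masked, length, offset
```

The point to take away is that the length field itself is variable-sized: small payloads cost one length byte, larger ones cost three or nine.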
If you need to implement communication over TCP, you could also save time by reusing existing RPC protocols, such as gRPC, Apache Thrift, XML-RPC, JSON-RPC, and many others. The mentioned WebSocket protocol can also be used by non-browser clients (which might be a good option if you are planning to expand functionality to the browser in the future).
Most MessagePack libraries would also let you deserialize a MessagePack-encoded object from a stream without providing its length, so you could simply communicate by sending and receiving MessagePack messages over your socket.
Upvotes: 1
Reputation: 182819
It depends on the protocol you're implementing on top of TCP. Its specification will tell you the correct approach to use.
If you're designing the protocol, generally you just follow the design of whatever existing protocol is closest to what you're doing. Common schemes include:
You encode each message as text ending with a newline character. The receiver just reads blocks of data and searches them for newline characters.
You encode each message as a variable-length block and send a 4-byte integer length (in network byte order) prior to each block. The receiver reads blocks of data. Once it has 4 bytes, it determines the length of the message. Once it has that many more bytes, it "snips off" the message and carries on parsing any leftover bytes.
You encode each message in a self-delimiting format like XML or JSON, so the receiver can tell where a message ends from its structure.
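As an illustration of the length-prefixed scheme, here is a minimal sketch in Python. The names `encode_frame` and `FrameDecoder`, and the 16 MiB size cap, are assumptions made for this example rather than part of any standard. The decoder is fed whatever `recv()` happens to return and hands back only complete messages.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order

def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length."""
    return HEADER.pack(len(payload)) + payload

class FrameDecoder:
    """Accumulates raw bytes and snips off complete frames (hypothetical helper)."""

    def __init__(self, max_size: int = 16 * 1024 * 1024):
        self._buf = bytearray()
        self._max = max_size  # assumed cap so a bad length cannot exhaust memory

    def feed(self, data: bytes) -> list:
        """Add received bytes; return a list of the complete payloads so far."""
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            if length > self._max:
                raise ValueError("frame too large")
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet; wait for more data
            frames.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]  # keep leftover bytes for the next frame
        return frames
```

On the receiving side you would call `decoder.feed(sock.recv(4096))` in a loop. For a message that mixes text and a binary file, one option is to send each field as its own frame, with the text encoded as UTF-8.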
Upvotes: 1