Rushi Kumar

Reputation: 68

Parser for TCP buffers

I want to implement a protocol to share data between a server and a client, but I don't know which one is appropriate. With performance as the main criterion, can anyone suggest the best protocol for parsing the data?

I have one in mind. I don't know its actual name, but it looks like this:

[Header][Message][Header][Message]


The header contains the length of the message, and the header size is fixed. I have tried this by performing a lot of concatenation and substring operations, which are costly. Can anyone suggest the best implementation for this?

Upvotes: 2

Views: 806

Answers (2)

Goswin von Brederlow

Reputation: 12322

For parsing there are two common solutions:

  1. small messages

Receive the data into a buffer, e.g. 64k. Then use pointers into that buffer to parse the header and message. Since the messages are small, there can be many messages in the buffer, and you call the parser again as long as there is unparsed data left. Note that the last message in the buffer might be truncated, in which case you have to keep the partial message and read more data into the buffer. If the partial message is near the end of the buffer, copying it to the front might be necessary.

  2. large messages

With large messages it makes sense to first only read the header. Then parse the header to get the message size, allocate an appropriate buffer for the message and then read the whole message into it before parsing it.
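As a rough illustration of the small-message approach, here is a minimal sketch. It assumes a 4-byte big-endian length header and a 16k message limit, neither of which comes from the question; `FrameParser`, `kHeaderSize` and `kMaxMessage` are made-up names. For simplicity it always moves a leftover partial message to the front of the buffer rather than only when it is near the end.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Assumed framing: 4-byte big-endian length, then the payload.
constexpr std::size_t kHeaderSize = 4;
// Assumed limit; must stay well below the buffer capacity.
constexpr std::size_t kMaxMessage = 16 * 1024;

class FrameParser {
public:
    explicit FrameParser(std::size_t capacity = 64 * 1024) : buf_(capacity) {}

    // Where the next recv()/read() should write, and how much room is left.
    char* write_ptr() { return buf_.data() + end_; }
    std::size_t write_space() const { return buf_.size() - end_; }

    // Call after n bytes were stored at write_ptr(). Invokes on_message once
    // per complete message, passing a view into the buffer (no copy).
    // Returns false if a header announces an oversized message.
    template <class F>
    bool commit(std::size_t n, F&& on_message) {
        end_ += n;
        while (end_ - begin_ >= kHeaderSize) {
            const auto* h =
                reinterpret_cast<const unsigned char*>(buf_.data() + begin_);
            const std::uint32_t len =
                (std::uint32_t(h[0]) << 24) | (std::uint32_t(h[1]) << 16) |
                (std::uint32_t(h[2]) << 8) | std::uint32_t(h[3]);
            if (len > kMaxMessage) return false;           // reject, caller closes
            if (end_ - begin_ < kHeaderSize + len) break;  // truncated message
            on_message(std::string_view(buf_.data() + begin_ + kHeaderSize, len));
            begin_ += kHeaderSize + len;
        }
        // Keep a partial message by moving it to the front of the buffer.
        if (begin_ == end_) {
            begin_ = end_ = 0;
        } else if (begin_ > 0) {
            std::memmove(buf_.data(), buf_.data() + begin_, end_ - begin_);
            end_ -= begin_;
            begin_ = 0;
        }
        return true;
    }

private:
    std::vector<char> buf_;
    std::size_t begin_ = 0;  // start of unparsed data
    std::size_t end_ = 0;    // end of received data
};
```

The receive loop would read into `write_ptr()` with at most `write_space()` bytes and then call `commit()` with the number of bytes received. The views passed to the callback are only valid during the call, since the buffer is reused afterwards.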

Note: In both cases you might want to handle overly large messages by either skipping them or terminating the connection with an error. For the first case a message cannot be larger than the buffer and should be a lot smaller. For the second case you don't want to allocate e.g. a gigabyte to buffer a message if messages are supposed to be around 1MB only.
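The large-message approach, including the size check from the note, could look roughly like the sketch below. It assumes a blocking POSIX file descriptor, a 4-byte big-endian length header and a 4MB cap; `read_exact`, `read_message` and `kMaxBody` are hypothetical names chosen for illustration.

```cpp
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumed upper bound for a message body; pick one that fits your protocol.
constexpr std::uint32_t kMaxBody = 4u * 1024 * 1024;

// Reads exactly n bytes, looping over short reads.
// Returns false on EOF or on an error other than EINTR.
bool read_exact(int fd, void* dst, std::size_t n) {
    auto* p = static_cast<unsigned char*>(dst);
    while (n > 0) {
        const ssize_t r = ::read(fd, p, n);
        if (r < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        if (r == 0) return false;  // peer closed the connection
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}

// Reads one message: header first, then a buffer sized from the header.
// Returns std::nullopt on EOF, error, or an oversized message, in which
// case the caller would typically close the connection.
std::optional<std::vector<unsigned char>> read_message(int fd) {
    unsigned char h[4];
    if (!read_exact(fd, h, sizeof h)) return std::nullopt;
    const std::uint32_t len =
        (std::uint32_t(h[0]) << 24) | (std::uint32_t(h[1]) << 16) |
        (std::uint32_t(h[2]) << 8) | std::uint32_t(h[3]);
    if (len > kMaxBody) return std::nullopt;  // refuse before allocating
    std::vector<unsigned char> body(len);
    if (!read_exact(fd, body.data(), body.size())) return std::nullopt;
    return body;
}
```

The point of checking `len` before constructing the vector is that a corrupt or hostile header never triggers a huge allocation.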

For sending messages it's best to first collect all the output. A std::vector can be sufficient. Or a rope of strings. Avoid copying the message into a larger buffer over and over. At most copy it once at the end when you have all the pieces. Using writev() to write a list of buffers instead of copying them all into one buffer can also be a solution.
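To illustrate the writev() idea, here is a minimal sketch that sends a header and a payload from two separate buffers without first concatenating them. It assumes a blocking POSIX file descriptor and a 4-byte big-endian length header; `send_frame` is a hypothetical helper, not part of any library.

```cpp
#include <sys/uio.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Sends [4-byte big-endian length][payload] using writev(), so the payload
// is never copied into a combined buffer. Returns false on error.
bool send_frame(int fd, std::string_view payload) {
    if (payload.size() > 0xFFFFFFFFu) return false;  // does not fit the header
    const auto len = static_cast<std::uint32_t>(payload.size());
    unsigned char header[4] = {
        static_cast<unsigned char>(len >> 24),
        static_cast<unsigned char>(len >> 16),
        static_cast<unsigned char>(len >> 8),
        static_cast<unsigned char>(len)};

    iovec iov[2];
    iov[0].iov_base = header;
    iov[0].iov_len = sizeof header;
    iov[1].iov_base = const_cast<char*>(payload.data());  // writev only reads it
    iov[1].iov_len = payload.size();

    int idx = 0;  // first iovec entry that still has unsent bytes
    while (idx < 2) {
        const ssize_t n = ::writev(fd, iov + idx, 2 - idx);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        // Skip entries that were written completely, then trim a partial one.
        std::size_t left = static_cast<std::size_t>(n);
        while (idx < 2 && left >= iov[idx].iov_len) {
            left -= iov[idx].iov_len;
            ++idx;
        }
        if (idx < 2) {
            iov[idx].iov_base = static_cast<char*>(iov[idx].iov_base) + left;
            iov[idx].iov_len -= left;
        }
    }
    return true;
}
```

The loop is there because writev() may write fewer bytes than requested. A message built from more pieces would use a longer iovec array in the same way.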

As for the best protocol... What's best? Simply sending the data in its in-memory binary format is fastest but will break when the two ends have different architectures or versions. Something like Google Protocol Buffers (protobuf) can solve that, but at the cost of some speed. It all depends on what your needs are.

Upvotes: 1

sehe

Reputation: 392853

The question is very broad.

On the topic of avoiding buffer/string concatenation, look at Buffer Sequences, described in Boost Asio's "Scatter-Gather" documentation.

Upvotes: 1
