Reputation: 39550
I've worked with a few protocols, and written my own. I have written some message formats with only 1 char to identify the message, and some with 4 chars. I don't feel that I'm experienced enough to tell which is better, so I'm looking for an answer which describes in which scenario one might be better than the other.
For performance, you would imagine that sending 2 bytes (A%1i
) is faster than sending 5 bytes (ABCD%1i
). However, I have noticed that when writing the protocol with the 1 byte prefix, if you have a bug which causes your code to not read enough data from the socket, you might get garbage data comming into your system.
So is the purpose of a 4 byte prefix just to provide a guarentee that your message is clean? Is it worth it for the performance you sacrafice? Do you really sacrafice any performance at all? Maybe it's better to have 2 or 3 byte prefix?
I'm not sure if this question should be specific to TCP, or whether it applies to all transport protocols. Advice on this would be interesting.
Update: For interest, I will mention that Synergy uses 4-byte message prefixes, so for a mouse move delta the header is the same size as the actual data. Some have suggested just having a 1 or 2 byte prefix to improve efficiency. I wonder what drawbacks this would have?
Update: Also, I wonder if only the handshake really matters, if you're worried about garbage data. Synergy has a long handshake (a few bytes), so are the 4-byte message prefixes needed? I made a protocol recently that has only a 1 byte handshake, and that turned out to be a bad idea, since incompatible protocols were spamming the system with bad data (off the back of this, I might reccomend at least having a long handshake).
Upvotes: 4
Views: 401
Reputation: 5490
The purpose of the header is to make it easier to solve the frame synchronization problem ( byte aligning in serial communication ). To synchronize, the receiver looks for anything in the data stream that "looks like" a start-of-message header. If you have lots of different kinds of valid start-of-message headers, and all of them are 1 byte long, then you will inevitably get a lot of "false frame synchronizations" -- garbage from something that "looks like" a start-of-message header, but isn't. It would be better to pick some other header that makes it "unlikely" that anything in the serial data stream "looks like" a valid start-of-message header.
It is inevitable that you will get garbage data coming into your system, no matter how you design the packet header. Whatever you use to handle these other problems (such as occasional bit errors in the middle of the message) should also be adequate to handle the occasional "false frame synchronization" garbage. In some systems, any bad data is quickly overwritten by fresh new good data, and if you blink you might never see the bad data. Other systems need at least some sort of error detection in the footer to reject the bad data. Yet other systems need to not only detect such errors, but somehow keep re-sending that message -- until both sides are convinced that an error-free version of that message has been successfully received.
As Oleksi implied, in some systems the latency is not significantly different between sending a single binary bit (100 ms) and sending 10 bytes (102.4 ms). So the advantages of using a tiny header (2.4% less latency!) may not be worth it compared to the advantages of using a more verbose header (easier debugging; easier to make backward-compatible and forward-compatible; easier to test the effect of minor changes "in isolation" without upgrading both sides in lockstep to the new protocol which is completely incompatible with the old protocol).
Perhaps you could get the best of both worlds by (a) keeping the verbose, easy-to-debug headers on messages that are so rarely used that the effect of tiny headers is too small to measure (which I suspect is nearly all messages), and (b) introducing a "tiny header" format for any kind of message where the effect of tiny headers is "noticeably better" or at least at least measurable. It looks like the Synergy protocol is flexible enough to add such a "tiny header" format in a way that is easily distinguishable from the other kinds of message headers.
I use Synergy between my laptop and a few desktop machines. I am glad someone is trying to make it even better.
Upvotes: 1
Reputation: 13097
The performance will depend on the content of the message you are sending. If your content is several kilobytes, it doesn't really matter how many bytes your header is. For now, I would choose the scheme that's easiest to work with, because the performance difference between sending one byte, or four bytes is going to be negligible compared to the actual data that you're sending.
Upvotes: 0