Reputation: 14505
I have created a number of TCP client/server applications in the past that always used very simple text-based protocols: either text messages separated by newlines, or XML streams.
Now I am creating a simple protocol for a game server that will mostly exchange Vector information and handle some simple RPC calls with clients. Because it's going to be a game server, I really want it to be super fast and lightweight. For this reason I decided to implement it as a binary protocol (see Are binary protocols dead? for more information on what I mean).
I have a rough idea of how to do that, but before I start working on it I would like to confirm that this approach is actually going to work, and whether there is a better, commonly used one. I have never implemented a binary TCP protocol before.
I am going to send all information in "packets of information" which I just call "datagrams". Another important constant within the protocol is the size of floating point numbers in bytes, which I just call BLOCK_SIZE. I am going to use multiple languages (the client will be C#, while the server is C++) and I need to make sure that all platforms (x86, x64) use the same-sized numbers within datagrams.
The first piece of information the server sends to the client is a single byte (sizeof(char)) containing the value of BLOCK_SIZE, just to make sure that floating point numbers (I am using double because it's big enough for my purposes) take the same number of bytes on both server and client. Then a stream of "datagrams" follows until the end of the communication channel.
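On the C++ side I plan to pin these sizes down with fixed-width types and a compile-time check, roughly like this (untested sketch; the socket call is just a placeholder):

```cpp
#include <cstdint>

// Size of my floating point numbers in bytes (8 on every platform I target).
constexpr std::uint8_t BLOCK_SIZE = sizeof(double);

// Refuse to compile if double is not 8 bytes, so x86 and x64 builds
// agree on the wire format.
static_assert(sizeof(double) == 8, "protocol requires an 8-byte double");

// The first byte the server sends to the client is the value of BLOCK_SIZE:
// send(client_socket, &BLOCK_SIZE, 1, 0);   // placeholder socket call
```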
The layout of a datagram is:
| Size in bytes | Name | Description |
| --- | --- | --- |
| BLOCK_SIZE | `type` | Type of datagram (for internal purposes I need to figure out what I am actually going to process; I could probably create a header that describes it in more detail, but for my purposes one `double` can hold all the types I will use). |
| unsigned BLOCK_SIZE | `size` | Length of the datagram: the number of bytes that will follow. I just hope I will never need to send a single datagram bigger than the maximum value of an unsigned double :P |
| `size` | `data` | The data contained within the datagram; I will process it based on its `type`. |
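For sending, I imagine building each datagram roughly like this (untested sketch; I am treating the `unsigned BLOCK_SIZE` size field as a `std::uint64_t`, i.e. an 8-byte unsigned integer):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Build one datagram: type (double), size (uint64_t), then the raw payload bytes.
std::vector<char> make_datagram(double type, const std::vector<char>& payload)
{
    std::uint64_t size = payload.size();
    std::vector<char> out(sizeof type + sizeof size + payload.size());

    std::memcpy(out.data(), &type, sizeof type);
    std::memcpy(out.data() + sizeof type, &size, sizeof size);
    if (!payload.empty())
        std::memcpy(out.data() + sizeof type + sizeof size,
                    payload.data(), payload.size());
    return out;
}
```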
I suppose that for receiving I would just always create a buffer as big as the announced size, and keep pushing data into it until I get all of it. Then I can start processing the next datagram.
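Something like this is what I have in mind for the receive side (untested sketch, assuming POSIX sockets; recv() can return fewer bytes than requested, so I loop until the buffer is full):

```cpp
#include <cstdint>
#include <vector>
#include <sys/types.h>
#include <sys/socket.h>

// Keep calling recv() until exactly `len` bytes have arrived (or the peer closes).
bool read_fully(int sock, char* buf, std::size_t len)
{
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n <= 0) return false;   // error or connection closed
        got += static_cast<std::size_t>(n);
    }
    return true;
}

// Read one datagram: the two header fields first, then exactly `size` payload bytes.
bool read_datagram(int sock, double& type, std::vector<char>& data)
{
    std::uint64_t size = 0;
    if (!read_fully(sock, reinterpret_cast<char*>(&type), sizeof type)) return false;
    if (!read_fully(sock, reinterpret_cast<char*>(&size), sizeof size)) return false;
    data.resize(size);
    return read_fully(sock, data.data(), data.size());
}
```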
I believe this should work as long as client and server always send correct data, with not a single byte being wrong, extra, or missing. I don't know if TCP is reliable enough, or if I need to implement some error checking as well; I have no idea how.
I only need to transfer huge amounts of arrays of floating point numbers, and eventually some very simple strings, but I don't really need to care about those right away. Is this the right approach, or should I do it in a different way?
Upvotes: 0
Views: 2466
Reputation: 2700
Do not use the word "datagram" if you are going to use TCP transport; a datagram is usually associated with UDP transfers.
TCP is reliable, so you don't need extra CRCs or anything like that.
You could implement, for example, a very simple binary stream protocol where every data unit is prefixed with 2 bytes: the first one indicating its class and the second one indicating its length (a small encoding sketch follows the table).
| Type | Class | Length (bytes) |
| --- | --- | --- |
| unsigned byte | 1 | 1 |
| byte | 2 | 1 |
| unsigned short | 3 | 2 |
| short | 4 | 2 |
| unsigned int | 5 | 4 |
| int | 6 | 4 |
| ... | | |
| array of unsigned byte | 1 | 1 * #elements |
| array of byte | 2 | 1 * #elements |
| array of unsigned short | 3 | 2 * #elements |
| array of short | 4 | 2 * #elements |
| array of unsigned int | 5 | 4 * #elements |
| array of int | 6 | 4 * #elements |
| ... | | |
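For example, encoding a single int as a data unit could look like this in C++ (just a sketch, with a made-up helper name and the class codes from the table above):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Encode one data unit: [class byte][length byte][payload].
std::vector<std::uint8_t> encode_int(std::int32_t value)
{
    std::vector<std::uint8_t> unit(2 + sizeof value);
    unit[0] = 6;             // class 6 = int (from the table above)
    unit[1] = sizeof value;  // length = 4 bytes
    std::memcpy(unit.data() + 2, &value, sizeof value);
    return unit;
}
```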
The representation of the length variable (one byte) limits the total size of a data element to 255 bytes. The representation of the class variable (one byte) limits the total number of classes to 255.
Remember that when using TCP you must handle a "flow" of data that is independent of the size of the transmitted data units. You should not make assumptions about the size of the received packets; a data unit could very well be split across more than one TCP packet, even if it would fit in a single one.
For example, the sequence 0x01-0x01-0x33 is a data unit that could represent the ASCII character "3".
You might think there is too much overhead, but transferring a single byte is the worst-case scenario; the overhead gets smaller as the data units get bigger. Also consider that this is the price to pay for not being dependent on a predefined higher-level data structure.
Upvotes: 1