majic bunnie

Reputation: 1405

Proper technique for sending multiple pieces of data of arbitrary length over TCP socket

I am curious as to the proper method for sending multiple pieces of data of arbitrary length over a socket in C. For example, if one were to send a "username" of arbitrary length, a "subject" of arbitrary length and a "message" of arbitrary length, what would the correct process be for sending these? Also, the data I am attempting to send may not necessarily be null-terminated, so I don't believe I could reassemble it correctly based solely on null bytes.

The method I have come up with would involve reading the first 4 bytes of the input received on the server and interpreting them as the size of the first piece of data, then reading exactly that many bytes from the socket and interpreting them as the first string. The server would then read 4 more bytes, interpret them as the length of the second string, read exactly that many bytes as the second string, and so on. However, this seems like it could be error-prone or have some implementation details that could cause things to go awry. Is there a better way to accomplish this?

Upvotes: 2

Views: 1563

Answers (3)

Jeremy Friesner

Reputation: 73294

In order to avoid having your send/receive routines become insanely complicated as your data structures grow more elaborate, I recommend splitting the problem into separate steps:

  1. Write routines that can frame and send an arbitrary buffer of N bytes over the TCP connection. (This would involve sending the 4-byte length prefix, and then sending the N bytes, as you described). You might also want to include a 4-byte type-code header that the receiver can use to easily identify which of your data structures the received bytes are intended to represent.

  2. Write a routine that converts (your favorite data structure) into a series of N bytes that are held in RAM. Then write an associated routine that de-converts a series of N bytes held in RAM back into (your favorite data structure).

  3. Repeat step (2) for any other data structures you want to send over the wire. Note that once you have multiple data types, the type-code field mentioned in (1) will make it easier for the receiver to know which de-convert routine to call after it has completely received a byte-buffer.

Once you've done the above, you can use your general-purpose send/receive-byte-buffers code from (1) to transport any of your data structures from (2), and thus you don't have to write separate send/receive code for every data structure, which is a big win.
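As a rough illustration of step (1), here is a minimal sketch of framed send/receive routines, assuming POSIX sockets. The names send_all, recv_all, send_frame, recv_frame and MAX_FRAME_LEN are made up for this example, and the header layout (type code first, then length, both big-endian) is just one possible choice. Because TCP is a byte stream, a single send() or recv() may transfer fewer bytes than requested, so both directions loop until the full count is done. EINTR handling and non-blocking sockets are left out.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Arbitrary cap (an assumption) so a bad length cannot trigger a huge malloc. */
#define MAX_FRAME_LEN (1u << 20)

/* Call send() until all len bytes are written. Returns 0 on success, -1 on error. */
static int send_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Call recv() until exactly len bytes have arrived. Returns 0 on success, -1 otherwise. */
static int recv_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)              /* 0 means the peer closed, <0 means error */
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send an 8-byte header (type code, payload length) followed by the payload. */
int send_frame(int fd, uint32_t type, const void *payload, uint32_t len)
{
    uint32_t hdr[2] = { htonl(type), htonl(len) };
    if (send_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    return send_all(fd, payload, len);
}

/* Receive one frame. On success *payload is malloc'ed and must be freed by the caller. */
int recv_frame(int fd, uint32_t *type, void **payload, uint32_t *len)
{
    uint32_t hdr[2];
    if (recv_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    *type = ntohl(hdr[0]);
    *len = ntohl(hdr[1]);
    if (*len > MAX_FRAME_LEN)
        return -1;
    *payload = malloc(*len ? *len : 1);   /* avoid malloc(0) */
    if (*payload == NULL)
        return -1;
    if (recv_all(fd, *payload, *len) != 0) {
        free(*payload);
        *payload = NULL;
        return -1;
    }
    return 0;
}
```

The receiver would typically switch on the returned type code to decide which de-convert routine from step (2) to call on the payload.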

Note that if you are worried about portability, you'll need to make sure to convert any multibyte integer or float values you want to send to big-endian (or little-endian; it doesn't matter as long as you are consistent) before you send them, and then convert them back to the native byte order on the receiver after you receive them (but before you use them for anything). You'll also need to avoid the temptation of just memcpy()'ing C structs into a byte-buffer, since different platforms and even different compiler versions may pad out the C structs differently in memory, which would lead to disastrous results if your sender and receiver aren't running the exact same executable.
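One way to follow that advice is to write each value into the byte buffer explicitly instead of copying a struct. The sketch below is only an illustration: the helpers put_u32_be, get_u32_be, put_field and get_field are hypothetical names, and the format (a 4-byte big-endian length followed by the raw bytes) is assumed. Shifting bytes out one at a time gives the same result on any host, regardless of its native byte order or struct padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store v as 4 bytes, most significant byte first. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read back a 4-byte big-endian value. */
static uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Append one length-prefixed field to out (capacity cap).
 * Returns the number of bytes written, or 0 if the field does not fit. */
size_t put_field(unsigned char *out, size_t cap, const void *data, uint32_t len)
{
    if (cap < 4 || cap - 4 < len)
        return 0;
    put_u32_be(out, len);
    memcpy(out + 4, data, len);
    return 4 + (size_t)len;
}

/* Parse one length-prefixed field from in (avail bytes available).
 * On success, *data points into the input buffer and the number of bytes
 * consumed is returned; 0 means the field is truncated. */
size_t get_field(const unsigned char *in, size_t avail,
                 const unsigned char **data, uint32_t *len)
{
    if (avail < 4)
        return 0;
    uint32_t n = get_u32_be(in);
    if (avail - 4 < n)
        return 0;
    *data = in + 4;
    *len = n;
    return 4 + (size_t)n;
}
```

A "username", "subject" and "message" could then be packed with three put_field calls into one buffer. Since each field carries its own length, embedded zero bytes in the data are not a problem.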

Upvotes: 1

Dan

Reputation: 340

Your proposal is one way to solve the problem. It's not "error prone" if the socket is using a reliable transport like TCP, since TCP guarantees that the data will be delivered uncorrupted and in the right order. Another way to do it would be to send fixed-size structures so the receiver will always read X bytes and know it has received a complete message. Another is to use a field terminator, either a NUL (which you mention) or a newline character; this is what many Internet protocols do (e.g. HTTP headers and FTP commands, which end each line with CRLF). Yet another is to use a serialization method (as another answer suggests). It all depends on what kind of data you're planning on sending and how portable the data needs to be between different types of systems.
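For the terminator-based approach, the receiver usually accumulates bytes in a buffer and scans for the delimiter. The function below is a minimal sketch of that scan, assuming newline-terminated fields with an optional preceding carriage return; find_line is a hypothetical name. Note that this only works when the payload can never contain the delimiter, which is not the case for arbitrary binary data unless some escaping scheme is added.

```c
#include <stddef.h>
#include <string.h>

/* Look for the first complete line in buf[0..len).
 * Returns the line length without its "\n" or "\r\n" terminator and stores
 * in *consumed how many bytes (terminator included) the caller should drop.
 * Returns -1 if no full line has arrived yet; *consumed is left unchanged. */
long find_line(const char *buf, size_t len, size_t *consumed)
{
    const char *nl = memchr(buf, '\n', len);
    if (nl == NULL)
        return -1;
    size_t line_len = (size_t)(nl - buf);
    *consumed = line_len + 1;
    if (line_len > 0 && buf[line_len - 1] == '\r')
        line_len--;              /* strip the CR of a CRLF pair */
    return (long)line_len;
}
```

When find_line returns -1, the caller would keep the partial data, call recv() again to append more bytes, and retry.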

Upvotes: 2

vanza

Reputation: 9903

You want to read about serialization. A few links that may be useful:

http://en.wikipedia.org/wiki/Serialization

http://en.wikipedia.org/wiki/External_Data_Representation

http://en.wikipedia.org/wiki/Protocol_buffers

And many others. Just pick your poison. Or write your own as you seem to be trying, as a learning exercise.

Upvotes: 2
