James Crow

Reputation: 200

TCP Socket Read Variable Length Data w/o Framing or Size Indicators

I am currently writing code to transfer data to a remote vendor. The transfer will take place over a TCP socket. The problem I have is the data is variable length and there are no framing or size markers. Sending the data is no problem, but I am unsure of the best way to handle the returned data.

The data consists of distinct "messages", but they do not have a fixed size. Each message starts with an 8- or 16-byte bitmap that indicates which components are included in that message. Some components are fixed length and some are variable. Each variable-length component has a size prefix covering only that portion of the overall message.

When I first open the socket I will send over messages and each one should receive a response. When I begin reading data I should be at the start of a message. I will need to interpret the bitmap to know what message fields are included. As the data arrives I will have to validate that each field indicated by the bitmap is present and of the correct size.
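To make the format concrete, here is a minimal sketch of an incremental parser for a bitmap-driven message. Everything specific in it is an assumption for illustration, not the vendor's actual spec: the `FIELDS` table, the rule that the top bit of the first byte selects a 16-byte bitmap, and the big-endian binary length prefixes. The function returns `None` when the buffer does not yet hold a complete message, so the caller knows to read more.

```python
from typing import Dict, Optional, Tuple

# Hypothetical field table: bit number -> ("fixed", size) or ("var", prefix_bytes).
FIELDS = {
    2: ("var", 1),    # variable length, 1-byte length prefix
    3: ("fixed", 3),
    4: ("fixed", 6),
    66: ("var", 2),   # only reachable when the bitmap is 16 bytes
}

def try_parse(buf: bytes) -> Optional[Tuple[Dict[int, bytes], int]]:
    """Return (fields, bytes_consumed), or None if buf holds an incomplete message."""
    if len(buf) < 8:
        return None
    # Assumption: the top bit of the first byte signals a 16-byte bitmap.
    bitmap_len = 16 if buf[0] & 0x80 else 8
    if len(buf) < bitmap_len:
        return None
    bits = int.from_bytes(buf[:bitmap_len], "big")
    total_bits = bitmap_len * 8
    pos = bitmap_len
    fields: Dict[int, bytes] = {}
    # Bit 1 is the most significant bit of the first byte; it is the size flag here.
    for bit in range(2, total_bits + 1):
        if not (bits >> (total_bits - bit)) & 1:
            continue
        if bit not in FIELDS:
            raise ValueError(f"bitmap sets unknown field {bit}")
        kind, n = FIELDS[bit]
        if kind == "var":
            if len(buf) < pos + n:
                return None          # length prefix not fully received yet
            size = int.from_bytes(buf[pos:pos + n], "big")
            pos += n
        else:
            size = n
        if len(buf) < pos + size:
            return None              # field body not fully received yet
        fields[bit] = buf[pos:pos + size]
        pos += size
    return fields, pos
```

Because the parser reports how many bytes it consumed, any bytes after that offset belong to the next message and can stay in the buffer.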

Once I have read all of the first message, the next one starts. My concern is if the transmission gets cut partway through a message, how can I recover and correctly find the next message start?

I will have to simulate a connection failure and my code needs to automatically retry a set number of times before canceling that message.

I have no control over the code on the remote end and cannot get framing bytes or size prefixes added to the messages.

Best practices, design patterns, or ideas on the best way to handle this are all welcomed.

Upvotes: 1

Views: 3105

Answers (3)

Alnitak

Reputation: 339786

A TCP stream cannot be "cut" mid-message and then resumed.

If there is a short enough break in transmission, the O/S at each end will cope and retransmit packets as necessary, but that is invisible to the end-user application. As far as the application is concerned, the stream is contiguous.

If the TCP connection does drop completely, both ends will have to re-open the connection. At that point, the transmitting system ought to start over at a new message boundary.

Upvotes: 1

Brian White

Reputation: 8716

From a user's point of view, TCP is a stream of data, just like you might receive over a serial port. There are no packets and no markers.

A non-blocking read/recv call returns whatever has arrived so far, which you can then parse. If, while parsing, you run out of data before reaching the end of the message, read/recv more data and continue parsing. Rinse. Repeat. Note that a single read can also return more bytes than one message needs if another message has followed on its heels, so keep the leftover bytes for the next parse.
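A rough sketch of that read-and-parse loop, under some assumptions: `iter_messages` and its `try_parse` callback are hypothetical names, and `try_parse(buf)` is assumed to return `(message, bytes_consumed)` for a complete message or `None` when more data is needed. It uses a blocking `recv` for brevity; the same buffering applies with non-blocking sockets.

```python
import socket
from typing import Any, Callable, Iterator, Optional, Tuple

def iter_messages(
    sock: socket.socket,
    try_parse: Callable[[bytes], Optional[Tuple[Any, int]]],
    chunk_size: int = 4096,
) -> Iterator[Any]:
    """Yield complete messages from a TCP stream, buffering partial ones."""
    buf = bytearray()
    while True:
        # Drain every complete message already sitting in the buffer.
        while True:
            result = try_parse(bytes(buf))
            if result is None:
                break                  # incomplete: need more bytes
            message, consumed = result
            del buf[:consumed]         # leftover bytes start the next message
            yield message
        chunk = sock.recv(chunk_size)
        if not chunk:                  # peer closed the connection
            if buf:
                raise ConnectionError("connection closed mid-message")
            return
        buf += chunk
```

The parser never sees packet boundaries, only the accumulated bytes, which matches the "stream, not packets" view described above.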

A TCP stream will not lose or re-order bytes. A message will not get truncated unless the connection gets broken or the sender has a bug (e.g. was only able to write/send part and then never tried to write/send the rest). You cannot continue a TCP stream that is broken. You can only open a new one and start fresh.
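Since a broken stream cannot be resumed, a retry amounts to opening a new connection and resending the whole message. A minimal sketch, assuming one request per connection attempt; `send_with_retry`, `read_response`, and the default attempt count, timeout, and delay are all hypothetical choices, not anything the vendor specifies.

```python
import socket
import time
from typing import Any, Callable, Tuple

def send_with_retry(
    address: Tuple[str, int],
    payload: bytes,
    read_response: Callable[[socket.socket], Any],
    max_attempts: int = 3,
    timeout: float = 10.0,
    delay: float = 1.0,
) -> Any:
    """Send payload on a fresh connection per attempt; give up after max_attempts."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # A new connection each time: the next read starts at a message boundary.
            with socket.create_connection(address, timeout=timeout) as sock:
                sock.sendall(payload)
                return read_response(sock)
        except OSError as exc:         # covers ConnectionError and timeouts
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay)
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error
```

One caveat with resending: if the connection dropped after the remote end had already processed the message, a retry may deliver it twice, so check whether the protocol gives you a way to detect duplicates.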

Upvotes: 1

Jordan Denison

Reputation: 2727

For something like this you would probably have a much easier time using a networking framework (like Netty), or a different IO mechanism entirely, like Iteratee IO with Play 2.0.

Upvotes: 0
