Reputation: 5519

How to separate TCP socket messages

I've experimented a bit with async TCP socket messages between two programs, for passing data, numbers and/or text. What I've done is to use a keyword at the start of each message, and then separate the values with the "|" character. So a message may look like this:

"DATA|490|40517.9328222222|1|6|11345|11347|11344|11345|106|40517.8494212963"

I set the read buffer size to 1024, as most of the messages will be within that length. However, when I rapidly send many short messages that together total fewer than 1024 characters, they seem to be read in one go. And if I send a message longer than 1024 characters, it is split. So I'm looking for some advice on how to handle this. Should I use some special characters to start and/or end each message? I'd appreciate some tips on how you do this.

Upvotes: 6

Answers (6)

MaximVdW

Reputation: 11

Protocol is everything. For my chat application I use a command-line-style argument protocol, like when you run

shutdown.exe -s -f -t 30

But then for sockets I use this

join John%20Doe            ' %20 for space
msg This%20Is%20a%20test   ' again %20 for space

This way it does not matter if your data is sent ASYNC :D Hope this helps
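A minimal Python sketch of this idea, assuming each command ends with a newline (the answer does not say what terminates a command, so the "\n" terminator is my assumption). The names `encode_command` and `LineReceiver` are made up for illustration. Because arguments are percent-encoded, a space or newline inside the data can never be mistaken for a separator.

```python
from urllib.parse import quote, unquote


def encode_command(name, *args):
    """Build one newline-terminated command with percent-encoded arguments."""
    parts = [name] + [quote(arg, safe="") for arg in args]
    return (" ".join(parts) + "\n").encode("ascii")


class LineReceiver:
    """Collects raw chunks and yields only complete commands."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes; return a list of (name, args) tuples."""
        self._buffer += chunk
        commands = []
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            name, *args = line.decode("ascii").split(" ")
            commands.append((name, [unquote(arg) for arg in args]))
        return commands
```

Whatever is left in the buffer after the last newline is an incomplete command and simply waits for the next read.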

Upvotes: 1

user207421

Reputation: 310893

There are several approaches.

A length word prefixed to each message.
An STX/ETX-style wrapping of each message so you can see where it starts and finishes. This requires escaping of ETX bytes that occur in the data, and that in turn requires escaping of ESC bytes too.
A self-describing protocol, for example XML, or a type-length-value based protocol.
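As a rough illustration of the STX/ETX option, here is a Python sketch. The byte values (0x02, 0x03, 0x1B) are the usual ASCII control codes, but the escaping rule (ESC followed by the literal byte) and the names `frame` and `FrameDecoder` are my own assumptions, not part of any standard.

```python
STX, ETX, ESC = b"\x02", b"\x03", b"\x1b"


def frame(payload):
    """Wrap payload in STX ... ETX, escaping ESC and ETX inside the data."""
    # ESC must be escaped first so the ESC bytes added for ETX stay single.
    escaped = payload.replace(ESC, ESC + ESC).replace(ETX, ESC + ETX)
    return STX + escaped + ETX


class FrameDecoder:
    """Byte-by-byte state machine that survives arbitrary chunk boundaries."""

    def __init__(self):
        self._payload = None   # None means we are between frames
        self._escaped = False  # True when the previous byte was ESC

    def feed(self, chunk):
        """Add received bytes; return a list of complete payloads."""
        messages = []
        for value in chunk:
            byte = bytes([value])
            if self._payload is None:
                if byte == STX:
                    self._payload = bytearray()
                continue  # ignore anything outside a frame
            if self._escaped:
                self._payload += byte
                self._escaped = False
            elif byte == ESC:
                self._escaped = True
            elif byte == ETX:
                messages.append(bytes(self._payload))
                self._payload = None
            else:
                self._payload += byte
        return messages
```

The decoder keeps its state between calls, so it does not matter whether a read returns half a message or several messages at once.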

Upvotes: 3

Cheeso

Reputation: 192467

The way TAR does it is to use blocks of fixed size. Every block in TAR is 512 bytes, and the file (message) may be entirely contained within that one block. If it's not, the first 512 bytes include a header that specifies how many additional blocks need to be read for that file (message).

Tar isn't a TCP app obviously, but it has similar data parsing or processing requirements.

Your typical message is smaller than 512 bytes, so maybe it makes sense to use a 64-byte block, or 128 or whatever, and ship all your data in packages of that size. You lose efficiency with the overhead of the "box size", but you may gain in efficiency and simplicity of the data-processing algorithm.
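A simplified Python sketch of the fixed-block idea. This is not the real TAR header layout: I am assuming a 2-byte count of additional blocks at the start of the first block, 64-byte blocks, and NUL padding, which only works if the payload never ends in a NUL byte (fine for the plain-text messages in the question). `pack_blocks` and `BlockReceiver` are hypothetical names.

```python
import struct

BLOCK_SIZE = 64
HEADER = struct.Struct("!H")  # assumed header: number of additional blocks


def pack_blocks(payload):
    """Pad header + payload with NUL bytes up to a whole number of blocks."""
    total = HEADER.size + len(payload)
    block_count = -(-total // BLOCK_SIZE)  # ceiling division
    body = HEADER.pack(block_count - 1) + payload
    return body.ljust(block_count * BLOCK_SIZE, b"\0")


class BlockReceiver:
    """Reassembles messages from a stream of fixed-size blocks."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes; return a list of complete payloads."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= BLOCK_SIZE:
            (extra_blocks,) = HEADER.unpack_from(self._buffer)
            total = (1 + extra_blocks) * BLOCK_SIZE
            if len(self._buffer) < total:
                break  # wait for the remaining blocks
            body = self._buffer[HEADER.size:total]
            messages.append(body.rstrip(b"\0"))  # drop the padding
            self._buffer = self._buffer[total:]
        return messages
```

The receiver only ever has to think in multiples of `BLOCK_SIZE`, which is the simplicity the answer is trading bandwidth for.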

Upvotes: 0

Cobra_Fast

Reputation: 16061

You could solve that problem by padding your messages with unique bytes (like 255, which doesn't appear in ASCII) to the buffer size and unpadding them on the receiving end. To me this isn't a very nice or smart fix, but it actually works.

Or you could send the overall packet length at the beginning of each packet, which is a little more challenging but works more efficiently than the padding technique when done right. Merged packets would then look something like this (schematically):

05|.....02|..03|...

Upvotes: 0

terminus

Reputation: 14462

The easiest way would be to send the size of the message at the beginning of the packet. This way you'd know how much data to read. So it would look like:

00015MESSAGE|1|2 ...

It's important for the size field to have a fixed size.

You can also make this size field binary, but it seems you are sending plain text, so this way you'd have a human-readable size field.
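Here is how the receiving side might look in Python, as a sketch under the assumptions above: a 5-digit, zero-padded ASCII length followed by an ASCII payload. The names `encode_message` and `LengthPrefixedReceiver` are invented for the example.

```python
LENGTH_DIGITS = 5  # fixed width of the size field, as in "00015"


def encode_message(text):
    """Prefix an ASCII message with its zero-padded decimal length."""
    payload = text.encode("ascii")
    if len(payload) >= 10 ** LENGTH_DIGITS:
        raise ValueError("message too long for the size field")
    return str(len(payload)).zfill(LENGTH_DIGITS).encode("ascii") + payload


class LengthPrefixedReceiver:
    """Buffers incoming chunks and returns only complete messages."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes; return a list of complete message strings."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= LENGTH_DIGITS:
            size = int(self._buffer[:LENGTH_DIGITS].decode("ascii"))
            end = LENGTH_DIGITS + size
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(self._buffer[LENGTH_DIGITS:end].decode("ascii"))
            self._buffer = self._buffer[end:]
        return messages
```

Call `feed` with whatever each read returns. Merged messages come out as separate items, and a message split across reads comes out once its last chunk arrives.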

Upvotes: 4

Steve Townsend

Reputation: 54148

The simplest way would be to send the message length at the beginning of each message, serialized in such a way that it will work on little-endian and big-endian hardware.

This could help your receiver preallocate its receive buffer efficiently too.
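A short Python sketch of the binary variant, assuming a 4-byte unsigned length in network (big-endian) byte order, which reads the same on any hardware. The 4-byte width and the helper names `send_message`, `recv_exact` and `recv_message` are my assumptions. The example uses blocking sockets for brevity.

```python
import socket
import struct

LENGTH = struct.Struct("!I")  # 4-byte unsigned int, network byte order


def send_message(sock, payload):
    """Send the payload preceded by its length."""
    sock.sendall(LENGTH.pack(len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly count bytes into a preallocated buffer."""
    buffer = bytearray(count)
    view = memoryview(buffer)
    received = 0
    while received < count:
        got = sock.recv_into(view[received:])
        if got == 0:
            raise ConnectionError("socket closed in the middle of a message")
        received += got
    return bytes(buffer)


def recv_message(sock):
    """Read the length field, then exactly that many payload bytes."""
    (size,) = LENGTH.unpack(recv_exact(sock, LENGTH.size))
    return recv_exact(sock, size)
```

Since the length is known before the payload is read, the buffer in `recv_exact` can be allocated once at the right size, which is the preallocation benefit mentioned above. In real code you would probably also want to reject absurdly large length values before allocating.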

Upvotes: 7
