BMBM

Reputation: 16013

How synchronized are sockets if at all?

I already read this question about socket synchronization, but I still don't get it.

Recently I was working on a relatively simple client/server app where the communication happens over a TCP socket. The client is written in PHP using the C-like functions PHP provides for interacting with sockets (especially fsockopen and fgetc); the server is written in node.js and uses a Stream for outputting data.

The protocol is quite simple: each message is just a string that ends with a 0-byte character.

Basically it works like this:

SERVER: Message 1
CLIENT: Ack 1

SERVER: Message 2
CLIENT: Ack 2

....

This worked fine, as my client processed one message at a time by reading char by char from the socket until it encountered a 0-byte, which designates the end of the message. The client then writes back to the server that it has successfully received the message (that's the Ack <message id> part).

Now this happened:

SERVER: Message 1
CLIENT: Ack 1

SERVER: Message 2
CLIENT: Ack 2

SERVER: Message 3
        Message 4
        Message 5
        Message 6
CLIENT: <DOH!>
....

In other words, the server unexpectedly sent multiple messages to the client in one "batch", although every message is a single stream.write(...) operation on the server. It seemed like the messages were buffered somewhere and then sent to the client all at once. My client code couldn't cope with multiple messages in the socket WITHOUT an Ack response in between, so it cut off the remaining messages after id 3.

So my question is:

I tested this on an Ubuntu VM (so no load or anything else that could provoke strange behaviour) using PHP 5.4 and node 0.6.x.

Upvotes: 2

Views: 3348

Answers (2)

PeterM

Reputation: 31

There are two synchronization concepts to deal with:

  1. The (generally) synchronous operation of send() or recv().
  2. The asynchronous way that one process sends a message and the way the other process handles the message.

If you can, try to avoid a design that keeps a client and server in process-synchronized "lock step" with each other. That's asking for trouble. What if one of the processes closes unexpectedly? The other process/thread might hang on a recv() waiting for data that will never come. It's one thing for your design to expect each message to be acknowledged eventually, but it's quite another for your design to require that each message be acknowledged before the next one may be sent.

Consider this:

Server: send 1
Client: ack 1
Server: send 2
Server: send 3
Client: ack 2
Server: send 4
Client: ack 3
Client: ack 4

A design that can accommodate this situation is better than one that expects:

Server: send 1
Client: ack 1
Server: send 2
Client: ack 2
Server: send 3
Client: ack 3
Server: send 4
Client: ack 4
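
To illustrate, here is a minimal sketch of sender-side bookkeeping that tolerates the first, interleaved sequence, where acks arrive while later messages are already in flight. The `AckTracker` class and its method names are hypothetical and not part of node.js; it only tracks ids and leaves the actual socket I/O out.

```javascript
// Hypothetical helper (not a node.js API): tracks which messages still await an ack.
class AckTracker {
  constructor() {
    this.nextId = 1;
    this.unacked = new Map(); // message id -> payload still awaiting an ack
  }

  // Record a message as sent and return the id to put on the wire.
  send(payload) {
    const id = this.nextId++;
    this.unacked.set(id, payload);
    return id;
  }

  // Mark a message as acknowledged; returns false for unknown or duplicate acks.
  ack(id) {
    return this.unacked.delete(id);
  }

  // Ids that have been sent but not yet acknowledged, in send order.
  outstanding() {
    return [...this.unacked.keys()];
  }
}

// Replay the start of the interleaved sequence: sends do not wait for acks.
const tracker = new AckTracker();
tracker.send('Message 1');
tracker.ack(1);
tracker.send('Message 2');
tracker.send('Message 3');
tracker.ack(2);
tracker.send('Message 4');
console.log(tracker.outstanding()); // [ 3, 4 ]
```

Because the sender only records what is outstanding, it never has to block on a single ack before writing the next message.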

Upvotes: 0

Ambroz Bizjak

Reputation: 8095

TCP is an abstraction of a bi-directional byte stream, and as such it has no concept of messages and cannot preserve message boundaries. There is no guarantee how multiple send() or recv() calls will map to TCP packets. You should treat send() as if calling it multiple times is equivalent to calling it once with the concatenation of all the data. More importantly, when receiving, you should make sure that your code interprets the incoming data in exactly the same way, no matter how it was split over individual recv() calls.

To receive properly, you can use a buffer where you store incomplete messages. But be careful that when you have an incomplete message in a buffer, the next recv() call may complete the current message, as well as provide zero or more complete messages, and possibly part of another incomplete message.
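
A minimal sketch of such a buffer, assuming the 0-byte-terminated messages from the question. It is written in JavaScript for illustration only; the `MessageFramer` class is a hypothetical helper, and the same logic would apply to the PHP client.

```javascript
// Hypothetical helper: reassembles 0-byte-terminated messages from a byte stream.
class MessageFramer {
  constructor() {
    this.pending = Buffer.alloc(0); // bytes of a message that is not yet complete
  }

  // Feed one chunk exactly as it was received.
  // Returns every message completed by this chunk (possibly none).
  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const messages = [];
    let end;
    // A single chunk may complete several messages, so keep scanning.
    while ((end = this.pending.indexOf(0)) !== -1) {
      messages.push(this.pending.subarray(0, end).toString('utf8'));
      this.pending = this.pending.subarray(end + 1);
    }
    return messages; // any trailing partial message stays in this.pending
  }
}

// The chunk boundaries here are arbitrary; the result must not depend on them.
const framer = new MessageFramer();
const received = [];
for (const chunk of ['Mess', 'age 3\0Message 4\0Mes', 'sage 5\0']) {
  received.push(...framer.push(Buffer.from(chunk, 'utf8')));
}
console.log(received); // [ 'Message 3', 'Message 4', 'Message 5' ]
```

With a framer like this, the client can acknowledge each returned message in turn, regardless of how many arrived in one read.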

The blocking or non-blocking mode doesn't change anything here - it's only about the way your application interfaces with the OS.

Upvotes: 4
