Gene Vincent

Reputation: 5469

How to detect that a network cable has been unplugged in a TCP connection?

I have a C++ networking application that accepts TCP connections from clients and then waits on the socket until the client decides to send data (sometimes they won't send anything for a long time, and that's OK).

It detects most error conditions, such as clients crashing or machines being turned off, but it takes many minutes to notice that the network cable to the client has been unplugged. I would prefer it to notice this condition as soon as possible.

I don't have control over the clients and I can't make them send something like a "ping". My server does send out a "ping" packet to the clients (but they won't send a response). Even when the cable is unplugged, write() returns the correct number of bytes (I can see the TCP stack sending retry packets in Wireshark).

What is the best way to notice the loss of connection? It would be most convenient if I could detect it on the write() call.

I need this to work on Windows and on Linux.

Upvotes: 4

Views: 9919

Answers (3)

er0

Reputation: 1834

Unfortunately, there is no way to distinguish the cable being pulled out at the other end from any other reason for packet loss. Having said that, you can approximate loss of connectivity at the other end as "indefinite packet loss" occurring over a sufficiently long period of time (say T). TCP tracks packet loss, so the general approach for doing this would be:

  • Get the number of unacked bytes in the connection (say it's B).
  • Send data of size N.
  • Set a timeout of T. When it fires, check the number of unacked bytes again. If it's still B+N, nothing you sent has been acknowledged, so assume that the other side has lost connectivity. At this point, you could try an ICMP echo to verify your assumption.

Getting TCP-specific information for a connection is not a standard interface on UNIX, and definitely not something portable to Windows. On Linux, there's a socket option called TCP_INFO, which you can read via getsockopt(). Google should give you some examples. I don't know if there's an equivalent option on Windows.
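For what it's worth, here is a minimal Linux-only sketch of the TCP_INFO query. The helper names `unacked_segments` and `write_still_unacked` are made up for illustration. One caveat: the `tcpi_unacked` field counts unacknowledged segments (packets), not bytes, so the B+N comparison above becomes "is the count still higher than it was before the write".

```cpp
// Linux-only sketch: ask the kernel how many sent segments are still
// waiting for an ACK on a connected TCP socket.
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>   // struct tcp_info, TCP_INFO
#include <sys/socket.h>

// Hypothetical helper. Returns the number of unacknowledged segments,
// or -1 if the query fails (e.g. the descriptor is not a TCP socket).
long unacked_segments(int fd) {
    assert(fd >= 0);
    tcp_info info{};
    socklen_t len = sizeof(info);
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0) {
        return -1;
    }
    return static_cast<long>(info.tcpi_unacked);
}

// Hypothetical helper for the timeout step. `before` is the count sampled
// just before the write. Call this once the timeout T has fired: if the
// count is still above `before`, the new data has not been acknowledged.
// A failed query (-1) is reported as "not stalled".
bool write_still_unacked(int fd, long before) {
    long now = unacked_segments(fd);
    return now > before;
}
```

You would sample `unacked_segments(fd)` before the write(), start your timer, and call `write_still_unacked(fd, before)` when it fires. Treat a `true` result as a hint rather than proof, since ordinary packet loss looks the same for a while.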

Another way to do this (i.e. approximate tracking of connectivity loss) is via RAW sockets. Open a RAW socket and filter it to receive only TCP traffic for your connection. Then rather than fetching information from TCP to determine if you are getting anything from the other end, simply wait to receive any packet from the other side. If you get something in the stipulated period, then it means the peer is still up.

Upvotes: 4

Ruben

Reputation: 1437

Your question is quite complex. There are a lot of things that can go wrong between you and your client, more than just an unplugged cable.

If you simply want to know whether your user is still online, you can open a new TCP connection. Because the 3-way handshake has to complete before a TCP connection is established, you know the client is online when the connection is successfully initialized. The problem is that if you want to keep your current connection active, you need another port. I don't know if this is an issue in your case.

But by the sounds of it, you are not really sending and receiving data from your client (apart from some ping data). So you can simply have your application loop and attempt a TCP connection every X seconds (the first 2 steps of the handshake, i.e. receiving the SYN-ACK, should be enough to determine that your client is still processing network data). If you don't get a response within Y milliseconds, you can say quite reliably that either your client or something in between has stopped "working".
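A rough POSIX sketch of such a probe, assuming the client actually has some port you are allowed to connect to (the question doesn't say it does). The function name `peer_answers_syn` is made up. It uses a non-blocking connect() with a poll() timeout and treats a refused connection as "alive" too, since an RST also proves the remote TCP stack is answering. On Windows you would need the Winsock equivalents (ioctlsocket, WSAPoll or select, closesocket).

```cpp
// POSIX sketch: probe a peer by starting a TCP handshake with a timeout.
#include <arpa/inet.h>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper. Returns true if the peer's TCP stack answered the
// SYN within timeout_ms, either by accepting (SYN-ACK) or by refusing
// (RST). Returns false on timeout, unreachable host, or bad input.
bool peer_answers_syn(const char* ipv4, unsigned short port, int timeout_ms) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) return false;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;

    // Non-blocking mode so connect() returns immediately.
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    bool answered = false;
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc == 0 || errno == ECONNREFUSED) {
        answered = true;  // completed or refused right away
    } else if (errno == EINPROGRESS) {
        // Handshake in flight: wait until the socket becomes writable
        // (success or failure) or the timeout expires.
        pollfd pfd{fd, POLLOUT, 0};
        if (poll(&pfd, 1, timeout_ms) > 0) {
            int err = 0;
            socklen_t len = sizeof(err);
            getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
            answered = (err == 0 || err == ECONNREFUSED);
        }
    }
    close(fd);
    return answered;
}
```

Calling something like `peer_answers_syn("192.0.2.10", 80, 500)` from a timer every few seconds would give you the loop described above; the address, port, and timeout here are placeholders.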

Hope this helps. If not, please give some more info on what your tool is doing.

Upvotes: 1

Remy Lebeau

Reputation: 598011

Sorry, but there is no way to detect an abnormal disconnection in a timely manner without pings/keepalives. Even the OS does not always know the cable has been pulled. That is why write() still works - the socket is happily buffering the data in its outgoing buffer, waiting to send it at a later time, because the socket state has not been invalidated by the OS yet. Eventually, the socket will time out internally, at which time the OS can finally invalidate the connection and let the socket report errors on subsequent operations. But that can take a long time, as you have noticed.

Since you cannot send application-layer pings, try enabling socket-layer keep-alives, at least. That might help. On Windows 2000+ only, you can use the SIO_KEEPALIVE_VALS socket option via WSAIoctl(), which lets you set the actual timer values of the keep-alives. On all platforms, you can use the SO_KEEPALIVE option via setsockopt(), but that option by itself does not let you configure the timer values, so system defaults are used unless the platform offers separate options for them (Linux, for example, has TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT).
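As a rough illustration, enabling keep-alives with explicit timers on both platforms could look like the sketch below. The function name `enable_keepalive` and the `socket_t` alias are made up. The Linux branch relies on the TCP_KEEPIDLE and TCP_KEEPINTVL options, which are Linux-specific additions on top of SO_KEEPALIVE; the probe count is left at the system default. I have only written the Windows branch against the documented `tcp_keepalive` struct, so treat it as untested.

```cpp
// Sketch: turn on TCP keep-alive probes with explicit timer values.
#ifdef _WIN32
#include <winsock2.h>
#include <mstcpip.h>       // SIO_KEEPALIVE_VALS, struct tcp_keepalive
using socket_t = SOCKET;
#else
#include <netinet/in.h>
#include <netinet/tcp.h>   // TCP_KEEPIDLE, TCP_KEEPINTVL (Linux)
#include <sys/socket.h>
using socket_t = int;
#endif

// Hypothetical helper.
//   idle_s:     seconds of silence before the first probe is sent
//   interval_s: seconds between unanswered probes
// Returns true on success.
bool enable_keepalive(socket_t s, unsigned idle_s, unsigned interval_s) {
#ifdef _WIN32
    tcp_keepalive ka{};
    ka.onoff = 1;
    ka.keepalivetime = idle_s * 1000;          // milliseconds
    ka.keepaliveinterval = interval_s * 1000;  // milliseconds
    DWORD bytes = 0;
    return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                    nullptr, 0, &bytes, nullptr, nullptr) == 0;
#else
    int on = 1;
    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0) {
        return false;
    }
#ifdef __linux__
    // Per-socket timer overrides; without these the system-wide
    // defaults apply.
    int idle = static_cast<int>(idle_s);
    int intvl = static_cast<int>(interval_s);
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) != 0) {
        return false;
    }
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) != 0) {
        return false;
    }
#endif
    return true;
#endif
}
```

You would call this once on each accepted socket, e.g. `enable_keepalive(client_fd, 10, 5)`; those numbers are only an example. Once the probes go unanswered, the OS invalidates the connection and subsequent reads and writes on the socket start failing.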

Upvotes: 3
