Reputation: 2538

Blocking recv call hangs if server is down

Another socket problem.

In my client code, I am sending a packet and expecting a response from the server side:

send()

recv() <-- it is blocking

Immediately after send(), the server crashes and reboots. In the meantime, recv() is waiting. But even after the server is back up, the recv() call still hangs. I have added SIGPIPE signal handling, but it still does not recognize that the socket is broken.

When I cancel the operation, recv() returns an error saying that an interrupt has been issued.

Could anyone help me rectify this error?

This is in a shared library running on a Solaris machine.

Upvotes: 6

Answers (4)

Adil

Reputation: 2538

Another way to make the recv() call non-blocking on Solaris is to use fcntl() to set the socket descriptor non-blocking:

fcntl(sockDesc, F_SETFL, O_NONBLOCK);

This can be used along with select() to protect your recv() from a spurious select() result (the case where select() returns a positive value but there is no data on the socket).
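A minimal sketch of that combination, assuming a connected stream socket. The helper names set_nonblocking() and try_recv() are made up for illustration. Unlike the one-liner above, this version reads the current flags first so that setting O_NONBLOCK does not clear any other status flags on the descriptor:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a descriptor into non-blocking mode, keeping its other status flags.
 * Returns 0 on success, -1 on failure (errno set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: one recv() attempt on a non-blocking socket.
 * Returns bytes read (> 0), 0 if the peer closed the connection,
 * -2 if no data is available right now, or -1 on any other error. */
ssize_t try_recv(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

If try_recv() returns -2 after select() reported the socket readable, go back to select() instead of treating it as a failure.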

Upvotes: 2

Todd Hayton

Reputation: 425

As others have mentioned, you can use select() to set a time limit for the socket to become readable.

By default, the socket will become readable when there's one or more bytes available in the socket receive buffer. I say "by default" because this amount is tunable by setting the socket receive buffer "low water mark" using the SO_RCVLOWAT socket option.
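For illustration, the low water mark can be changed with a call like the one sketched here. The helper name set_rcvlowat() is hypothetical, and how strictly select() honors the option varies between platforms, so check your system's man pages before relying on it:

```c
#include <sys/socket.h>

/* Hypothetical helper: ask for at least `min_bytes` in the receive buffer
 * before the socket counts as readable.
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int set_rcvlowat(int sock, int min_bytes)
{
    return setsockopt(sock, SOL_SOCKET, SO_RCVLOWAT,
                      &min_bytes, sizeof min_bytes);
}
```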

Below is a function you can use to determine if the socket is ready to be read within a specified time limit. It will return 1 if the socket has data available for reading. Otherwise, it will return 0 if it times out.

The code is based on an example from the book Unix Network Programming (www.unpbook.com) that can provide you with more information.

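#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>

/* Wait for "timeout" seconds for the socket to become readable */
int readable_timeout(int sock, int timeout)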
{
    struct timeval tv;
    fd_set         rset;
    int            isready;

    FD_ZERO(&rset);
    FD_SET(sock, &rset);

    tv.tv_sec  = timeout;
    tv.tv_usec = 0;

 again:
    isready = select(sock+1, &rset, NULL, NULL, &tv);
    if (isready < 0) {
        if (errno == EINTR) goto again;
        perror("select"); _exit(1);
    }

    return isready;
}

Use it like this:

if (readable_timeout(sock, 5/*timeout*/)) {
    recv(sock, ...);
}

You mention handling SIGPIPE on the client side, which is a separate issue. If you are getting it, it means your client is writing to the socket even after having received a RST from the server. That is separate from the problem of a blocking call to recv().

The way that could arise is that the server crashes and reboots, losing its TCP state. Your client sends data to the server which sends back a RST, since it no longer has state for the connection. Your client ignores the RST and tries to send more data and it's this second send() which causes your program to receive the SIGPIPE signal.

What error were you getting from the call to recv()?

Upvotes: 4

Hans W

Reputation: 3891

The problem is that the connection is never actually closed. (No FIN packets are sent; the other end just goes away.)

What you want to do is set a timeout for recv'ing on the socket, using setsockopt(3) with SO_RCVTIMEO as option_name.
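A rough sketch of how the timed-out case can be told apart from other failures, assuming the platform implements SO_RCVTIMEO (the setsockopt() result is checked for that reason). The helper name recv_with_timeout() and its -2 return code are made up for illustration:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: recv() that gives up after `seconds`.
 * Returns bytes read (> 0), 0 if the peer closed the connection,
 * -2 on timeout, or -1 on any other error (errno set). */
ssize_t recv_with_timeout(int sock, void *buf, size_t len, int seconds)
{
    struct timeval tv;
    ssize_t n;

    tv.tv_sec  = seconds;
    tv.tv_usec = 0;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == -1)
        return -1;

    n = recv(sock, buf, len, 0);
    /* A timed-out recv() fails with EAGAIN or EWOULDBLOCK. */
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

On -2 the caller can decide that the server is gone, close the socket, and reconnect.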

Upvotes: 3

Patrick

Reputation: 2405

Maybe you should set a timeout in order to handle this case. It can easily be done by using setsockopt() to set the SO_RCVTIMEO option on your socket:

  struct timeval tv;
  tv.tv_sec = 30;
  tv.tv_usec = 0;
  if (setsockopt(socket_fd, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv,  sizeof tv))
  {
    perror("setsockopt");
    return -1;
  }

Another possibility is to use non-blocking sockets and manage reads and writes with poll(2) or select(2). You should take a look at Beej's Guide to Network Programming.

Upvotes: 5
