Aleksei Gutikov

Reputation: 383

Why can an application accept and read/write a TCP connection in CLOSE_WAIT state (Linux)?

My assumption was that a server can read from and write to a socket only when the TCP connection is in the ESTABLISHED state. But I see that the server can actually read from and write to a socket when the TCP connection is in the CLOSE_WAIT state. This happens when the client has closed the connection on its side, but the server hasn't detected or handled the end-of-stream condition yet.

For example, consider a synchronous, blocking, single-threaded server:

#include <unistd.h>
#include <netdb.h>
#include <signal.h>

int main (void)
{
    signal(SIGPIPE, SIG_IGN); // ignore SIGPIPE from last send()
    int listen_sock = socket(PF_INET, SOCK_STREAM, 0);
    setsockopt(listen_sock, SOL_SOCKET, SO_REUSEADDR, &(int){1}, sizeof(int));
    struct sockaddr_in addr = {.sin_family = AF_INET, .sin_port = htons(8000), .sin_addr.s_addr = htonl(INADDR_ANY)};
    bind(listen_sock, (struct sockaddr *) &addr, sizeof(addr));
    listen(listen_sock, 10);
    while (1) {
        int sock = accept(listen_sock, 0, 0);
        char buffer[1024];
        recv(sock, buffer, sizeof(buffer), 0);
        sleep(1);
        send(sock, buffer, 10, 0);
        send(sock, buffer, 10, 0);
        sleep(1);
        send(sock, buffer, 10, 0);
        close(sock);
    }
}

Compile and run:

gcc server_short.c -o server_short && strace ./server_short

Watch connections on the same host:

watch -n0.1 'sudo ss -lnt | grep 8000; sudo netstat -tpvn | grep 8000'

Run short-lived clients on another host:

seq 20 | xargs -P100 -n1 bash -c 'echo -n "0123456789ABCDEF" | telnet $SERVER_IP 8000'

Then I see connections in the CLOSE_WAIT and SYN_RECV states. The SYN_RECV connections turn into CLOSE_WAIT as the server works through the queued connections:

LISTEN    11        10                 0.0.0.0:8000             0.0.0.0:*       
tcp       17      0 server_IP:8000       client_IP:33016        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33030        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33020        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33028        CLOSE_WAIT  -                   
tcp        0      0 server_IP:8000       client_IP:33012        CLOSE_WAIT  21282/./server 
tcp        0      0 server_IP:8000       client_IP:33046        SYN_RECV    -                   
tcp        0      0 server_IP:8000       client_IP:33040        SYN_RECV    -                   
tcp        0      0 server_IP:8000       client_IP:33044        SYN_RECV    -                   
tcp        0      0 server_IP:8000       client_IP:33036        SYN_RECV    -                   
tcp       17      0 server_IP:8000       client_IP:33018        CLOSE_WAIT  -                   
tcp        0      0 server_IP:8000       client_IP:33048        SYN_RECV    -                   
tcp       17      0 server_IP:8000       client_IP:33034        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33022        CLOSE_WAIT  -                   
tcp        0      0 server_IP:8000       client_IP:33042        SYN_RECV    -                   
tcp        0      0 server_IP:8000       client_IP:33038        SYN_RECV    -                   
tcp       17      0 server_IP:8000       client_IP:33032        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33026        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33014        CLOSE_WAIT  -                   
tcp       17      0 server_IP:8000       client_IP:33024        CLOSE_WAIT  -

The server then handles these CLOSE_WAIT connections one by one:

socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3, 10)                           = 0
accept(3, NULL, NULL)                   = 4
recvfrom(4, "0123456789ABCDEF", 1024, 0, NULL, NULL) = 16
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffdb9a2d0e0) = 0
sendto(4, "0123456789", 10, 0, NULL, 0) = 10
sendto(4, "0123456789", 10, 0, NULL, 0) = 10
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffdb9a2d0e0) = 0
sendto(4, "0123456789", 10, 0, NULL, 0) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=22658, si_uid=1001} ---
close(4)                                = 0
accept(3, NULL, NULL)                   = 4
recvfrom(4, "0123456789ABCDEF", 1024, 0, NULL, NULL) = 16
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffdb9a2d0e0) = 0
sendto(4, "0123456789", 10, 0, NULL, 0) = 10
sendto(4, "0123456789", 10, 0, NULL, 0) = 10
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffdb9a2d0e0) = 0
sendto(4, "0123456789", 10, 0, NULL, 0) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=22658, si_uid=1001} ---
close(4)                                = 0
... AND SO ON

So the server can successfully accept the connection, read the request, process it, and send a response before it detects end-of-stream by calling read or detects a broken pipe by calling write.

I understand that this server code is not ideal, but even if you make it asynchronous (with select, epoll, boost::asio, whatever) you will still accept CLOSE_WAIT connections, read the incoming request, and kick off processing before detecting that the connection is no longer alive. Meanwhile, the server side (the kernel) already knew that the client had closed the connection before the server accepted it.

So, my questions:

  1. Does this behaviour follow the TCP specification? Is it legal to accept and read from a CLOSE_WAIT connection?
  2. Why does the kernel allow accepting CLOSE_WAIT connections? Or why does recv not return an error in this case? Who would even be interested in reading data from a closed connection? What is the purpose of this behaviour? I do not see how this case can be used.
  3. How can I detect this case and avoid processing a request when I am 100% sure nobody will receive the response?

Upvotes: 1

Views: 421

Answers (1)

TabascoEye

Reputation: 686

The TCP stack is independent of your user space application. When you set a socket into LISTEN, the TCP stack already accepts incoming connections (an incoming SYN is answered with SYN|ACK) and also stores incoming packets in the receive buffer.

When your program calls accept(), it either blocks and waits for someone to connect to the socket, or it returns an established socket that already existed from the TCP stack's (or TCP state machine's) point of view.
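To make that concrete, here is a minimal sketch (Linux/POSIX, loopback only) in which a client connects and sends data before the server side ever calls accept(). The function name `read_data_queued_before_accept`, the "hello" payload, and the use of an ephemeral loopback port are assumptions made for illustration; they are not from the question.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the client connects and sends "hello" BEFORE accept()
 * is called; afterwards the data is read from the accepted socket.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_data_queued_before_accept(char *out, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    ssize_t n = -1;
    int conn = -1;
    int lsock = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);

    assert(out != NULL && cap > 0);
    if (lsock < 0 || client < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(lsock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lsock, 1) < 0 ||
        getsockname(lsock, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* The kernel completes the handshake and queues the data on its own;
     * nothing has called accept() at this point. */
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        send(client, "hello", 5, 0) != 5)
        goto out;

    /* accept() just hands over the connection the kernel already holds. */
    conn = accept(lsock, NULL, NULL);
    if (conn >= 0)
        n = recv(conn, out, cap, 0);
out:
    if (conn >= 0)
        close(conn);
    if (client >= 0)
        close(client);
    if (lsock >= 0)
        close(lsock);
    return n;
}
```

Calling it with a 16-byte buffer should return 5 and leave "hello" in the buffer, even though connect() and send() both finished before accept() ran.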

There are totally legit use cases where a client opens a connection, delivers some data and immediately closes the connection. The server then can take time to retrieve that data from the network buffers and handle a request (if that request does not need an answer to a client for example).
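A related case is the half-close: a client may send its request, call shutdown(SHUT_WR) to signal "no more data from me", and still wait for the reply. On the wire this FIN looks the same to the server as the FIN from a full close(), which is one reason the kernel cannot reject reads or writes in CLOSE_WAIT. The sketch below assumes a loopback setup; the function name `reply_after_peer_half_close` and the "ping"/"pong" payloads are made up for illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the server read the request, saw EOF,
 * and the client still received the 4-byte reply; returns 0 otherwise. */
int reply_after_peer_half_close(char *reply, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char req[16];
    int ok = 0, conn = -1;
    int lsock = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);

    assert(reply != NULL && cap >= 4);
    if (lsock < 0 || client < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(lsock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lsock, 1) < 0 ||
        getsockname(lsock, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* Client: send the request, then send FIN but keep the read side open. */
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        send(client, "ping", 4, 0) != 4 ||
        shutdown(client, SHUT_WR) < 0)
        goto out;

    conn = accept(lsock, NULL, NULL);
    if (conn < 0)
        goto out;

    /* Server: first recv() yields the request, second yields 0 (EOF),
     * which means the FIN has arrived and the socket is in CLOSE_WAIT. */
    if (recv(conn, req, sizeof(req), 0) != 4 ||
        recv(conn, req, sizeof(req), 0) != 0)
        goto out;

    /* Writing is still allowed, and this client is still listening. */
    if (send(conn, "pong", 4, 0) != 4)
        goto out;
    ok = (recv(client, reply, cap, 0) == 4);
out:
    if (conn >= 0)
        close(conn);
    if (client >= 0)
        close(client);
    if (lsock >= 0)
        close(lsock);
    return ok;
}
```

The telnet clients in the question close the socket completely, so their replies are lost, but the server's kernel has no way to tell the two situations apart when the FIN arrives.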

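CLOSE_WAIT is inherently the state "TCP stack is waiting for the local application to close the socket as well", so your assumption that you can only read() when the state is ESTABLISHED is wrong. The peer's FIN only says that the peer will send no more data; data already received can still be read, and the local side may still write.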

Upvotes: 2
