surlac

Reputation: 2961

Keep-alive: dead peers detection

I run a client and a socket server, both written in Go (1.12), on macOS localhost.

The server calls SetKeepAlive and SetKeepAlivePeriod on the net.TCPConn.
The client sends a packet and then either closes the connection (FIN) or is abruptly terminated.

Tcpdump shows that even after the client closes the connection, the server keeps sending keep-alive probes.
Shouldn't it detect that the peer is "dead" and close the connection?

The question is generic; feel free to point out any basics I'm missing.

package main

import (
    "flag"
    "fmt"
    "net"
    "os"
    "time"
)

func main() {
    var client bool
    flag.BoolVar(&client, "client", false, "")
    flag.Parse()

    if client {
        fmt.Println("Client mode")
        conn, err := net.Dial("tcp", "127.0.0.1:12345")
        checkErr("Dial", err)

        written, err := conn.Write([]byte("howdy"))
        checkErr("Write", err)

        fmt.Printf("Written: %v\n", written)
        fmt.Println("Holding conn")

        time.Sleep(60 * time.Second)

        err = conn.Close()
        checkErr("Close", err)

        fmt.Println("Closed conn")

        return
    }

    fmt.Println("Server mode")
    l, err := net.Listen("tcp", "127.0.0.1:12345")
    checkErr("listen", err)
    defer l.Close()

    for {
        c, err := l.Accept()
        checkErr("accept", err)
        defer c.Close()

        tcpConn := c.(*net.TCPConn)
        err = tcpConn.SetKeepAlive(true)
        checkErr("SetKeepAlive", err)
        err = tcpConn.SetKeepAlivePeriod(5 * time.Second)
        checkErr("SetKeepAlivePeriod", err)

        b := make([]byte, 1024)

        n, err := c.Read(b)
        checkErr("read", err)

        fmt.Printf("Received: %v\n", string(b[:n]))
    }
}

func checkErr(location string, err error) {
    if err != nil {
        fmt.Printf("%v: %v\n", location, err)
        os.Exit(-1)
    }
}

Upvotes: 0

Views: 1990

Answers (1)

filipe

Reputation: 2047

The response to that question:

Sending keep-alives is only necessary when you need the connection to stay open while idle. In that case there is a risk that the connection is broken without either side noticing, so keep-alive probes try to detect broken connections.

If you had closed the connection on the server side with a proper c.Close(), the keep-alive would not be triggered. In your code the close is deferred inside the accept loop, so it only runs when the main function returns.
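To illustrate the point about closing on the server side, here is a minimal sketch (not the asker's original code) that moves the per-connection work into its own function, so the deferred Close runs as soon as that connection is done. It also treats io.EOF from Read as a clean close by the peer rather than a fatal error. The names handle and roundTrip are hypothetical helpers introduced only for this example, and errors.Is assumes Go 1.13 or newer.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// handle serves a single connection. The deferred Close runs when handle
// returns, not when main exits, so the socket is released promptly.
func handle(c net.Conn) (string, error) {
	defer c.Close()

	// Same keep-alive settings as in the question.
	if tc, ok := c.(*net.TCPConn); ok {
		if err := tc.SetKeepAlive(true); err != nil {
			return "", err
		}
		if err := tc.SetKeepAlivePeriod(5 * time.Second); err != nil {
			return "", err
		}
	}

	var received []byte
	buf := make([]byte, 1024)
	for {
		n, err := c.Read(buf)
		received = append(received, buf[:n]...)
		if errors.Is(err, io.EOF) {
			// The peer sent FIN: a clean close, not a failure.
			return string(received), nil
		}
		if err != nil {
			return string(received), err
		}
	}
}

// roundTrip is a hypothetical helper for this demo: it starts a listener on
// a free local port, lets a client send msg and close, and returns what the
// server side read before seeing EOF.
func roundTrip(msg string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()

	go func() {
		conn, err := net.Dial("tcp", l.Addr().String())
		if err != nil {
			l.Close() // unblock Accept if the dial fails
			return
		}
		conn.Write([]byte(msg))
		conn.Close()
	}()

	c, err := l.Accept()
	if err != nil {
		return "", err
	}
	return handle(c)
}

func main() {
	got, err := roundTrip("howdy")
	fmt.Printf("received %q, err=%v\n", got, err)
}
```

In a real server the accept loop would typically call go handle(c) for each accepted connection, so one slow peer does not block the others.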

If you test your server code, you will see it start sending keep-alive probes after the timeout you set.

You will notice that only after all keep-alive probes have been sent (9 by default, a kernel setting), with the interval between probes elapsing 8 times, do you get an io.EOF error from Read on the server side (and yes, at that point the server stops sending)!

Currently the Go implementation is the same on Linux and macOS: it sets both TCP_KEEPINTVL and TCP_KEEPIDLE to the value you pass to the setKeepAlivePeriod function, so the behavior depends on the kernel version.

func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
    // The kernel expects seconds so round to next highest second.
    d += (time.Second - time.Nanosecond)
    secs := int(d.Seconds())
    if err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs); err != nil {
        return wrapSyscallError("setsockopt", err)
    }
    err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs)
    runtime.KeepAlive(fd)
    return wrapSyscallError("setsockopt", err)
}

A request to provide a way to set the keep-alive time and interval separately has been open since 2014.


Upvotes: 3
