Saif Abid

Reputation: 13

golang unix socket error. dial: resource temporarily unavailable

I'm trying to use Unix sockets with fluentd for a logging task, and every once in a while, seemingly at random, I get this error:

dial: {socket_name} resource temporarily unavailable

Any ideas as to why this might be occurring?

I tried adding "retry" logic, to reduce the error, but it still occurs at times.

Also, for fluentd we are using the default config for Unix socket communication.

func connect() {
    var connection net.Conn
    var err error
    for i := 0; i < retry_count; i++ {
        connection, err = net.Dial("unix", path_to_socket)
        if err == nil {
            break
        }
        time.Sleep(time.Duration(math.Exp2(float64(retry_count))) * time.Millisecond)
    }
    if err != nil {
        fmt.Println(err)
    } else {
        connection.Write(data_to_send_socket)
    }
    defer connection.Close()
}

Upvotes: 1

Views: 5199

Answers (2)

James Henstridge

Reputation: 43899

Go creates its sockets in non-blocking mode, which means that certain system calls that would usually block instead return immediately with an error. In most cases Go transparently handles the EAGAIN error (which is what the "resource temporarily unavailable" message indicates) by waiting until the socket is ready to read/write. It doesn't seem to have this logic for the connect call in Dial, though.

It is possible for connect to return EAGAIN when connecting to a UNIX domain socket if its listen queue has filled up. This will happen if clients are connecting to it faster than it is accepting them. Go should probably wait on the socket until it becomes connectable in this case and then retry, similar to what it does for Read/Write, but it doesn't seem to have that logic.

So your best bet would be to handle the error by waiting and retrying the Dial call. That, or work out why your server isn't accepting connections in a timely manner.

Upvotes: 2

Caleb

Reputation: 9458

For the exponential backoff you can use this library: github.com/cenkalti/backoff. I think the way you have it now it always sleeps for the same amount of time.

For the network error, you need to check whether it's a temporary error or not. If it is, then retry:

type TemporaryError interface {
    Temporary() bool
}

func dial() (conn net.Conn, err error) {
    backoff.Retry(func() error {
        conn, err = net.Dial("unix", "/tmp/ex.socket")
        if err != nil {
            // if this is a temporary error, then retry
            if terr, ok := err.(TemporaryError); ok && terr.Temporary() {
                return err
            }
        }
        // on success or a non-temporary error, return nil to stop retrying
        return nil
    }, backoff.NewExponentialBackOff())
    return
}

Upvotes: 0
