Dai

Reputation: 155428

How do I timeout a blocking external library call?

(I don't believe my question is a duplicate of this Q&A: "go routine blocking the others one". I'm running Go 1.9, which has the preemptive scheduler, whereas that question was asked about Go 1.2.)

My Go program calls into a C library through a Go wrapper library. One of the wrapped calls is blocking and can last over 60 seconds. I want to add a timeout so that my code returns after 3 seconds.

Original code, with the long blocking call:

// InvokeSomething is part of a Go wrapper library that calls the C library read_something function. I cannot change this code.

func InvokeSomething() ([]Something, error) {
    ret := clib.read_something(&input) // this can block for 60 seconds
    if ret.Code > 1 {
        return nil, CreateError(ret)
    }
    return ret.Something, nil
}

// This is my code I can change:

func MyCode() {

    something, err := InvokeSomething()
    // etc
}

My code with a go-routine, channel, and timeout, based on this Go example: https://gobyexample.com/timeouts

type somethingResult struct {
    Something []Something
    Err       error
}

func MyCodeWithTimeout() {
    ch := make(chan somethingResult, 1)
    go func() {
        something, err := InvokeSomething() // blocks here for 60 seconds
        ret := somethingResult{ something, err }
        ch <- ret
    }()
    select {
    case result := <-ch:
        // etc
    case <-time.After(time.Second * 3):
        // report timeout
    }
}

However, when I run MyCodeWithTimeout, it still takes 60 seconds before the case <-time.After(time.Second * 3) branch executes.

I know that reading from an unbuffered channel with nothing in it will block, but I created the channel with a buffer size of 1, so as far as I can tell I'm doing it correctly. I'm surprised the Go scheduler isn't preempting my goroutine. Or does preemption depend on execution being in Go code rather than in an external native library?

Update:

I read that the Go scheduler, at least in 2015, is actually "semi-preemptive" and doesn't preempt OS threads that are in "external code": https://github.com/golang/go/issues/11462

you can think of the Go scheduler as being partially preemptive. It's by no means fully cooperative, since user code generally has no control over scheduling points, but it's also not able to preempt at arbitrary points

I heard that runtime.LockOSThread() might help, so I changed the function to this:

func MyCodeWithTimeout() {
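    ch := make(chan somethingResult, 1)
    // Note: ch is not closed here. After a timeout the goroutine may still
    // send its result, and sending on a closed channel panics.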
    go func() {
        runtime.LockOSThread()
        defer runtime.UnlockOSThread()
        something, err := InvokeSomething() // blocks here for 60 seconds
        ret := somethingResult{ something, err }
        ch <- ret
    }()
    select {
    case result := <-ch:
        // etc
    case <-time.After(time.Second * 3):
        // report timeout
    }
}

However, that didn't help at all: it still blocks for 60 seconds.

Upvotes: 3

Views: 1279

Answers (1)

icza

Reputation: 418107

Your proposed solution, locking the thread inside the goroutine started by MyCodeWithTimeout(), does not guarantee that MyCodeWithTimeout() will return after 3 seconds, for two reasons. First, there is no guarantee that the started goroutine will get scheduled and reach the point where it locks the thread to the goroutine. Second, even if the external command or syscall gets called and returns within 3 seconds, there is no guarantee that the other goroutine, the one running MyCodeWithTimeout(), will get scheduled to receive the result.

Instead do the thread locking in MyCodeWithTimeout(), not in the goroutine it starts:

func MyCodeWithTimeout() {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()

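    ch := make(chan somethingResult, 1)
    // Note: ch is not closed here. After a timeout the goroutine may still
    // send its result, and sending on a closed channel panics.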
    go func() {
        something, err := InvokeSomething() // blocks here for 60 seconds
        ret := somethingResult{ something, err }
        ch <- ret
    }()
    select {
    case result := <-ch:
        // etc
    case <-time.After(time.Second * 3):
        // report timeout
    }
}

Now, when MyCodeWithTimeout() starts executing, it locks its goroutine to the OS thread, and you can be sure that this goroutine will notice the value sent on the timer's channel.
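For experimenting with this pattern without the C library, here is a minimal, self-contained sketch. The names callWithTimeout, simulateBlocking and errTimeout are hypothetical and not part of the original code, and time.Sleep merely stands in for InvokeSomething(). Unlike a real blocking native call, time.Sleep yields to the scheduler, so this sketch shows the structure of the pattern but will not reproduce the 60-second stall from the question.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"time"
)

// errTimeout is a hypothetical sentinel error used only in this sketch.
var errTimeout = errors.New("timed out")

// simulateBlocking returns a function that sleeps for d and then succeeds.
// It is a stand-in (assumption) for the blocking InvokeSomething() call.
func simulateBlocking(d time.Duration) func() (string, error) {
	return func() (string, error) {
		time.Sleep(d)
		return "done", nil
	}
}

// callWithTimeout runs fn in a new goroutine and waits at most timeout for
// its result. The calling goroutine is locked to its OS thread while waiting.
func callWithTimeout(fn func() (string, error), timeout time.Duration) (string, error) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	type result struct {
		val string
		err error
	}
	// Buffered with capacity 1 so the worker can deliver its result and exit
	// even if nobody is receiving anymore. The channel is never closed,
	// because a late send on a closed channel would panic.
	ch := make(chan result, 1)
	go func() {
		v, err := fn()
		ch <- result{v, err}
	}()

	select {
	case r := <-ch:
		return r.val, r.err
	case <-time.After(timeout):
		return "", errTimeout
	}
}

func main() {
	v, err := callWithTimeout(simulateBlocking(2*time.Second), 100*time.Millisecond)
	fmt.Printf("slow call: value=%q err=%v\n", v, err)

	v, err = callWithTimeout(simulateBlocking(10*time.Millisecond), time.Second)
	fmt.Printf("fast call: value=%q err=%v\n", v, err)
}
```

Note that on a timeout the worker goroutine keeps running until fn returns; the pattern only stops waiting for it, it does not cancel the underlying call.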

NOTE: This is better if you want it to return within 3 seconds, but it still gives no guarantee, because the timer that fires (sends a value on its channel) runs in its own goroutine, and this thread locking has no effect on the scheduling of that goroutine.

If you want a guarantee, you can't rely on other goroutines giving the "exit" signal. You can only rely on it happening in the goroutine running the MyCodeWithTimeout() function (because you did thread locking, you can be sure it gets scheduled).

An "ugly" solution, which busy-waits and therefore drives up CPU usage on one CPU core, would be:

for end := time.Now().Add(time.Second * 3); time.Now().Before(end); {
    // Do non-blocking check:
    select {
    case result := <-ch:
        // Process result, then return (a bare break here would only leave the select, not the loop)
    default: // Must have default to be non-blocking
    }
}
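Wrapped into a function, the busy-wait loop could look like the following sketch. The name pollUntil and the int payload are assumptions made for illustration; in the original code the channel would carry somethingResult values. The function keeps all timing decisions in the calling goroutine, which is the point of the approach.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// pollUntil (hypothetical helper) busy-waits on ch until a value arrives or
// timeout elapses. It returns the value and true on success, or 0 and false
// on timeout. It burns one CPU core while waiting.
func pollUntil(ch <-chan int, timeout time.Duration) (int, bool) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	for end := time.Now().Add(timeout); time.Now().Before(end); {
		// Non-blocking check: the default case makes select return at once
		// when no value is ready.
		select {
		case v := <-ch:
			return v, true
		default:
		}
	}
	return 0, false
}

func main() {
	ch := make(chan int, 1)
	go func() {
		time.Sleep(10 * time.Millisecond) // stand-in for the slow call
		ch <- 42
	}()
	v, ok := pollUntil(ch, time.Second)
	fmt.Println("with sender:", v, ok)

	empty := make(chan int, 1) // nobody ever sends on this channel
	v, ok = pollUntil(empty, 50*time.Millisecond)
	fmt.Println("without sender:", v, ok)
}
```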

Note that giving in to the "urge" to use time.Sleep() in this loop would take away the guarantee, as time.Sleep() may use goroutines in its implementation and certainly does not guarantee to return exactly after the given duration.

Also note that if you have 8 CPU cores and runtime.GOMAXPROCS(0) returns 8 for you, and your goroutines are still "starving", this may serve as a temporary solution, but you have more serious problems with how your app uses Go's concurrency primitives (or fails to use them), and you should investigate and fix those. Locking a thread to a goroutine may even make things worse for the rest of the goroutines.
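As a starting point for that investigation, a small sketch like the one below prints the values mentioned above. The helper name schedulerInfo is hypothetical; the runtime functions it calls are standard library functions.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo (hypothetical helper) reports the current GOMAXPROCS setting,
// the number of logical CPUs, and the number of live goroutines.
func schedulerInfo() (procs, cpus, goroutines int) {
	// GOMAXPROCS(0) only queries the current value; it does not change it.
	return runtime.GOMAXPROCS(0), runtime.NumCPU(), runtime.NumGoroutine()
}

func main() {
	procs, cpus, goroutines := schedulerInfo()
	fmt.Printf("GOMAXPROCS=%d NumCPU=%d goroutines=%d\n", procs, cpus, goroutines)
}
```

If GOMAXPROCS already matches the CPU count and the goroutine count is modest, starvation is more likely caused by how the goroutines are written than by a lack of available threads.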

Upvotes: 2
