Yongqi Z
Yongqi Z

Reputation: 641

Different wg.Done() causes WaitGroup is reused before previous Wait has returned

I am testing sync.WaitGroup, if I put defer wg.Done() in the begining of the function, like this:

package main

import (
        "fmt"
        "sync"
        "time"
)

func worker(wg *sync.WaitGroup, id int) error {
    defer wg.Done() // put here cause error
    fmt.Printf("Worker %v: Finished\n", id)
    if true {
        return nil
    }

    return nil
}

var wg sync.WaitGroup // I should put `wg` outside of this function
func callWorker(i int){
    fmt.Println("Main: Starting worker", i)
    fmt.Printf("Worker %v: Finished\n", id)
    wg.Add(1)
    go worker(&wg, i)
    wg.Wait()
}

func main() {
    for i := 0; i < 1000; i++ {
        go callWorker(i)
    }
    time.Sleep(time.Second * 60)
    fmt.Println("Main: Waiting for workers to finish")

    fmt.Println("Main: Completed")
}

I will get WaitGroup is reused before previous Wait has returned in some cases, like this

enter image description here

but if I put defer wg.Done() in the end of function, it runs successfully, why?

func worker(wg *sync.WaitGroup, id int) error {
    
    fmt.Printf("Worker %v: Finished\n", id)
    if true {
        return nil
    }
    defer wg.Done() // put here, it is ok
    return nil
}

Upvotes: 1

Views: 1687

Answers (2)

volkit
volkit

Reputation: 1611

The problem is that if Wait() is called on a waitgroup, it is not allowed to reuse this waitgroup and call Add() on it again until the Wait() call is returned (see docs). In this programm the callWorker function is itself a go routine and all callWorker functions are running concurrently. Without waiting for each other they try to call Add() while a previous Wait() call isn't finished.

The first workers yield results without error because their Wait() call is luckily returned before the next Add() call, a classical race condition.

If you want the workers to run concurrently you have to move Wait() and Add() out of the callWorker function. Wait() must be called after the for-loop. And Add() should be called inside the loop before callWorker, otherwise the program will finish before callWorker has a chance to add something to the waitgroup and hence Wait() has nothing to wait for. There is no need for time.Sleep in main(), too.

func main() {
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go callWorker(i)
    }
    wg.Wait()
    fmt.Println("Main: Waiting for workers to finish")

    fmt.Println("Main: Completed")
}

Upvotes: 0

Sean F
Sean F

Reputation: 4605

The docs state that "If a WaitGroup is reused to wait for several independent sets of events, new Add calls must happen after all previous Wait calls have returned"

You are calling wg.Done() in some goroutines before calling wg.Add(1) in others, which is not allowed, as stated by the docs. You need to call wg.Add before you start all those goroutines, and you might as well just call it once, wg.Add(1000)

The reason your other code works is that it never calls wg.Done(), you have

if true {
     return nil
}
defer wg.Done()

so you always return without reaching the defer statement, so there are never any calls to wg.Done().

Do this:

func callWorker(i int){
    fmt.Println("Main: Starting worker", i)
    // you cannot call Add here because Done has been called in other goroutines
    go worker(&wg, i)
    wg.Wait()
}

func main() {
    wg.Add(1000) // <---- You must call Add before Done is called in any goroutine
    for i := 0; i < 1000; i++ {
        go callWorker(i)
    }
    time.Sleep(time.Second * 60)
    fmt.Println("Main: Completed")
}

Upvotes: 3

Related Questions