Reputation: 1265
I'm trying to understand a code example which represents multiple readers and writers in Go.
The example fetches one or more web pages and reports the size of each.
Code version 1:
package main
import (
	"fmt"
	"io/ioutil"
	"net/http"
)
func main() {
	urls := []string{"http://google.com", "http://yahoo.com", "http://reddit.com"}
	sizeCh := make(chan string)
	urlCh := make(chan string)
	for i := 0; i < 3; i++ { // later we change i<3 to i<2
		go worker(urlCh, sizeCh, i)
	}
	for _, u := range urls {
		urlCh <- u // later: go generator(u, urlCh)
	}
	for i := 0; i < len(urls); i++ {
		fmt.Println(<-sizeCh)
	}
}
func worker(urlCh chan string, sizeCh chan string, id int) {
	for {
		url := <-urlCh
		length, err := getPage(url)
		if err == nil {
			sizeCh <- fmt.Sprintf("%s has legth %d. worker %d", url, length, id)
		} else {
			sizeCh <- fmt.Sprintf("Error getting %s: %s. worker %d", url, err, id)
		}
	}
}
func getPage(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	return len(body), nil
}
The result:
http://reddit.com has legth 110937. worker 0
http://google.com has legth 18719. worker 2
http://yahoo.com has legth 326987. worker 1
But after changing the loop that starts the workers from for i := 0; i < 3; i++ to for i := 0; i < 2; i++, so that there are fewer workers than URLs, we get no result (the program waits forever).
In version 2, we add a helper function to version 1:
func generator(url string, urlCh chan string) {
	urlCh <- url
}
and replace the loop that sends the URLs on urlCh with:
for _, u := range urls {
	go generator(u, urlCh)
}
It works fine even with i<2:
http://google.com has legth 18701. worker 1
http://reddit.com has legth 112469. worker 0
http://yahoo.com has legth 325752. worker 1
Why does version 1 fail with i<2 (i.e. with fewer workers than URLs), while version 2 does not?
Upvotes: 1
Views: 96
Reputation: 43899
In your program, you have the following loop iterating over the 3 URLs:
for _, u := range urls {
	urlCh <- u // later: go generator(u, urlCh)
}
Since urlCh is unbuffered, the send operation in the loop body will not complete until a corresponding receive operation is performed by another goroutine.
When you had 3 worker goroutines, this was no problem: each worker received one URL. When you reduced the number to two, at least one worker has to progress far enough to receive a second value from urlCh.
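The blocking behaviour of an unbuffered send can be observed in isolation. The following is a minimal sketch, not part of the original program: trySend is a hypothetical helper that attempts a non-blocking send with select and reports whether the send could complete immediately.

```go
package main

import "fmt"

// trySend is a hypothetical helper for illustration only. It attempts a
// non-blocking send and reports whether the value was accepted right away.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		// Nobody is ready to receive and there is no free buffer slot.
		return false
	}
}

func main() {
	unbuffered := make(chan string)  // like urlCh and sizeCh in the question
	buffered := make(chan string, 1) // one slot of buffer, for comparison

	// No goroutine is receiving, so the unbuffered send cannot complete.
	fmt.Println(trySend(unbuffered, "http://google.com")) // false

	// The buffered channel has a free slot, so the send completes.
	fmt.Println(trySend(buffered, "http://google.com")) // true
}
```

In the question's program the sends are ordinary blocking sends, so instead of returning false the sending goroutine simply waits.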
Now if we look at the body of worker, we can see the problem:
for {
	url := <-urlCh
	length, err := getPage(url)
	if err == nil {
		sizeCh <- fmt.Sprintf("%s has legth %d. worker %d", url, length, id)
	} else {
		sizeCh <- fmt.Sprintf("Error getting %s: %s. worker %d", url, err, id)
	}
}
An iteration of this loop can't complete until it successfully sends a value on sizeCh. Since this channel is also unbuffered, that won't happen until another goroutine is ready to receive a value from that channel.
Unfortunately, the only goroutine that will do that is main, which only does so once it has finished sending values to urlCh. Thus we have a deadlock: main is blocked sending the third URL, and both workers are blocked sending on sizeCh.
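The stall can be reproduced without any network access. The sketch below is an approximation of version 1, under a few assumptions: fakeSize is a hypothetical stand-in for getPage, and sentBeforeStall and stallTimeout are illustrative names that do not appear in the original. Each send to urlCh is given a timeout so that the program reports how many sends completed instead of hanging.

```go
package main

import (
	"fmt"
	"time"
)

// stallTimeout is an arbitrary illustrative value: a send that takes longer
// than this is treated as stuck.
const stallTimeout = 200 * time.Millisecond

// fakeSize is a hypothetical stand-in for getPage; it needs no network.
func fakeSize(url string) int { return len(url) }

// worker mirrors the loop from the question: receive a URL, then send the
// result on the unbuffered sizeCh.
func worker(urlCh chan string, sizeCh chan string, id int) {
	for {
		url := <-urlCh
		sizeCh <- fmt.Sprintf("%s has length %d. worker %d", url, fakeSize(url), id)
	}
}

// sentBeforeStall starts the workers and then sends the URLs the way
// version 1 does, that is, without anyone reading sizeCh yet. It returns
// how many sends completed before one blocked for longer than timeout.
// The workers are left blocked afterwards, which is acceptable for a demo.
func sentBeforeStall(urls []string, workers int, timeout time.Duration) int {
	urlCh := make(chan string)
	sizeCh := make(chan string)
	for i := 0; i < workers; i++ {
		go worker(urlCh, sizeCh, i)
	}
	for n, u := range urls {
		select {
		case urlCh <- u:
		case <-time.After(timeout):
			return n // every worker is stuck sending on sizeCh
		}
	}
	return len(urls)
}

func main() {
	urls := []string{"http://google.com", "http://yahoo.com", "http://reddit.com"}
	fmt.Println("3 workers, sends completed:", sentBeforeStall(urls, 3, stallTimeout))
	fmt.Println("2 workers, sends completed:", sentBeforeStall(urls, 2, stallTimeout))
}
```

With three workers all three sends go through; with two, the third send never finds a receiver, which is the point where the original main gets stuck.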
Moving the sends to urlCh to separate goroutines fixes the problem because main can progress to the point where it is reading from sizeCh, even though not all values have been sent to urlCh.
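An offline sketch of the version 2 arrangement, assuming the same hypothetical fakeSize stand-in for getPage and an illustrative collect function that is not in the original, shows that two workers are then enough for three URLs:

```go
package main

import (
	"fmt"
	"sort"
)

// fakeSize is a hypothetical stand-in for getPage; it needs no network.
func fakeSize(url string) int { return len(url) }

// worker receives URLs and sends one result line per URL, as in the question.
func worker(urlCh chan string, sizeCh chan string, id int) {
	for {
		url := <-urlCh
		sizeCh <- fmt.Sprintf("%s has length %d. worker %d", url, fakeSize(url), id)
	}
}

// generator sends a single URL; it runs in its own goroutine so that main
// is never the one blocked on urlCh.
func generator(url string, urlCh chan string) {
	urlCh <- url
}

// collect wires everything together the way version 2 does and returns the
// result lines, sorted so the order does not depend on scheduling.
func collect(urls []string, workers int) []string {
	urlCh := make(chan string)
	sizeCh := make(chan string)
	for i := 0; i < workers; i++ {
		go worker(urlCh, sizeCh, i)
	}
	for _, u := range urls {
		go generator(u, urlCh)
	}
	// main reaches this loop immediately, so workers can always hand over
	// a result and go back for the next URL.
	results := make([]string, 0, len(urls))
	for range urls {
		results = append(results, <-sizeCh)
	}
	sort.Strings(results)
	return results
}

func main() {
	urls := []string{"http://google.com", "http://yahoo.com", "http://reddit.com"}
	for _, line := range collect(urls, 2) {
		fmt.Println(line)
	}
}
```

Which worker handles which URL varies from run to run, but one line per URL is always produced, even with a single worker.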
Upvotes: 2