Reputation: 3465
I'm trying to understand more about go's channels and goroutines, so I decided to make a little program that count words from a file, read by a bufio.NewScanner
object:
nCPUs := flag.Int("cpu", 2, "number of CPUs to use")
flag.Parse()
runtime.GOMAXPROCS(*nCPUs)
scanner := bufio.NewScanner(file)
lines := make(chan string)
results := make(chan int)
for i := 0; i < *nCPUs; i++ {
go func() {
for line := range lines {
fmt.Printf("%s\n", line)
results <- len(strings.Split(line, " "))
}
}()
}
for scanner.Scan(){
lines <- scanner.Text()
}
close(lines)
acc := 0
for i := range results {
acc += i
}
fmt.Printf("%d\n", acc)
Now, in most examples I've found so far both the lines
and results
channels would be buffered, such as make(chan int, NUMBER_OF_LINES_IN_FILE)
. Still, after running this code, my program exists with a fatal error: all goroutines are asleep - deadlock!
error message.
Basically my thought it's that I need two channels: one to communicate to the goroutine the lines from the file (as it can be of any size, I don't like to think that I need to inform the size in the make(chan)
function call. The other channel would collect the results from the goroutine and in the main function I would use it to e.g. calculate an accumulated result.
What should be the best option to program in this manner with goroutines and channels? Any help is much appreciated.
Upvotes: 2
Views: 2686
Reputation: 18567
As @AndrewN has pointed out, the problem is each goroutine gets to the point where it's trying to send to the results
channel, but those sends will block because the results
channel is unbuffered and nothing reads from them until the for i := range results
loop. You never get to that loop, because you first need to finish the for scanner.Scan()
loop, which is trying to send all the line
s down the lines
channel, which is blocked because the goroutines are never looping back to the range lines
because they're stuck sending to results
.
The first thing you might try to do to fix this is to put the scanner.Scan()
stuff in a goroutine, so that something can start reading off the results
channel right away. However, the next problem you'll have is knowing when to end the for i := range results
loop. You want to have something close the results
channel, but only after the original goroutines are done reading off the lines
channel. You could close the results
channel right after closing the lines
channel, however I think that might introduce a potential race, so the safest thing to do is also wait for the original two goroutines to be done before closing the results
channel: (playground link):
package main
import "fmt"
import "runtime"
import "bufio"
import "strings"
import "sync"
func main() {
runtime.GOMAXPROCS(2)
scanner := bufio.NewScanner(strings.NewReader(`
hi mom
hi dad
hi sister
goodbye`))
lines := make(chan string)
results := make(chan int)
wg := sync.WaitGroup{}
for i := 0; i < 2; i++ {
wg.Add(1)
go func() {
for line := range lines {
fmt.Printf("%s\n", line)
results <- len(strings.Split(line, " "))
}
wg.Done()
}()
}
go func() {
for scanner.Scan() {
lines <- scanner.Text()
}
close(lines)
wg.Wait()
close(results)
}()
acc := 0
for i := range results {
acc += i
}
fmt.Printf("%d\n", acc)
}
Upvotes: 7
Reputation: 425
Channels in go are unbuffered by default, which means that none of the anonymous goroutines you spawn can send to the results channel until you start trying to receive from that channel. That doesn't start executing in the main program until scanner.Scan() is done filling up the line channel...which it's blocked from doing until your anonymous functions can send to the results channel and restart their loops. Deadlock.
The other problem in your code, even when trivially fixing the above by buffering the channels, is that for i := range results will also deadlock once there are no more results being fed into it, since the channel hasn't been closed.
Edit: Here's one potential solution, if you want to avoid buffered channels. Basically, the first issue is avoided by performing the send to the results channel via a new goroutine, allowing the lines loop to complete. The second issue (not knowing when to stop reading a channel) is avoided by counting the goroutines as they are created and explicitly closing down the channel when every goroutine is accounted for. It's probably better to do something similar with waitgroups, but this is just a very fast way to show how to do this unbuffered.
Upvotes: 5