Reputation: 15577
Scenario:
I want to read a large text file fast (e.g. lorem.txt with 4.5 million lines in the example below). I have tried three different ways, shown in the code below.
Pseudo-Benchmarks: Here is a typical output I get when I run the code below; these are quick-and-dirty time deltas, not full-fledged profiling / benchmarking / testing.
sequential: 43.541828091 secs 4714074 lines
queued channel: 80.986544385 secs 4714074 lines
wait group: 260.200473751 secs 4712266 lines
Question: Why is the sequential loop faster than the two other methods below? Am I missing something?
Update: I ran the example code on a much bigger text file (see Scenario). I also "reset" the logger for each example, just in case not doing so would cause some compounding memory issue between the example functions. Also, as others have noted, my computer is dual core, which may be one of many issues with my code. Thanks for all the feedback / answers.
package main

import (
    "bufio"
    "bytes"
    "fmt"
    "log"
    "os"
    "sync"
    "time"
)

var (
    textFile = "lorem.txt"
    buf      bytes.Buffer
    l        = log.New(&buf, "logger: ", log.Lshortfile)
    wg       sync.WaitGroup
    delta1   float64
    delta2   float64
    delta3   float64
    cnt1     = 0
    cnt2     = 0
    cnt3     = 0
)
func main() {
    // Wait Group Example
    exampleWaitGroup()
    // Queued Channel Example
    exampleQueuedChannel()
    // Sequential Loop Example
    exampleSequentialLoop()

    benchmarks := fmt.Sprintf("sequential:\t%v secs %v lines\nqueued channel:\t%v secs %v lines\nwait group:\t%v secs %v lines\n",
        delta1, cnt1,
        delta2, cnt2,
        delta3, cnt3,
    )
    fmt.Println(benchmarks)
}
func exampleSequentialLoop() {
    buf.Reset()
    l = log.New(&buf, "logger: ", log.Lshortfile)
    start := time.Now()

    file1, err := os.Open(textFile)
    if err != nil {
        log.Fatal(err)
    }
    defer file1.Close()

    scanner := bufio.NewScanner(file1)
    for scanner.Scan() {
        cnt1++
        l.Println(scanner.Text())
    }

    end := time.Now()
    delta1 = end.Sub(start).Seconds()
}
func exampleQueuedChannel() {
    buf.Reset()
    l = log.New(&buf, "logger: ", log.Lshortfile)
    start := time.Now()

    queue := make(chan string)
    done := make(chan bool)
    go processQueue(queue, done)

    file2, err := os.Open(textFile)
    if err != nil {
        log.Fatal(err)
    }
    defer file2.Close()

    scanner := bufio.NewScanner(file2)
    for scanner.Scan() {
        queue <- scanner.Text()
    }
    close(queue) // signal the consumer that no more lines are coming
    <-done       // wait for the consumer to drain the queue before stopping the clock

    end := time.Now()
    delta2 = end.Sub(start).Seconds()
}
func exampleWaitGroup() {
    buf.Reset()
    l = log.New(&buf, "logger: ", log.Lshortfile)
    start := time.Now()

    file3, err := os.Open(textFile)
    if err != nil {
        log.Fatal(err)
    }
    defer file3.Close()

    scanner := bufio.NewScanner(file3)
    for scanner.Scan() {
        wg.Add(1)
        go func(line string) {
            defer wg.Done()
            l.Println(line)
            // note: this unsynchronized increment from many goroutines
            // is a data race, which is why the line count comes up short
            cnt3++
        }(scanner.Text())
    }
    wg.Wait()

    end := time.Now()
    delta3 = end.Sub(start).Seconds()
}
func processQueue(queue chan string, done chan bool) {
    for line := range queue {
        l.Println(line)
        cnt2++
    }
    done <- true
}
Upvotes: 3
Views: 1669
Reputation: 131
I don't think there should be a performance gain. In a producer/consumer scenario, introducing concurrency (for the consumers) only makes sense if your consumer is slower than your producer. In that case you can expect a performance gain by introducing multiple consumers.
But here the consumer is already significantly faster than the producer (the I/O), hence no performance gain.
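For illustration, here is a minimal sketch of the multiple-consumer pattern described above; the process function, the worker count of 4, and the channel capacity are hypothetical stand-ins, not taken from the question:

package main

import (
    "bufio"
    "fmt"
    "os"
    "sync"
)

// process stands in for per-line work that is slower than reading the
// file; only then does fanning out to multiple consumers pay off.
func process(line string) { _ = len(line) }

func main() {
    file, err := os.Open("lorem.txt")
    if err != nil {
        panic(err)
    }
    defer file.Close()

    queue := make(chan string, 1024) // buffered so the producer rarely blocks
    var wg sync.WaitGroup

    // fan out: several consumers drain one queue
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for line := range queue {
                process(line)
            }
        }()
    }

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        queue <- scanner.Text()
    }
    close(queue) // no more lines; lets the consumer loops end
    wg.Wait()
    fmt.Println("done")
}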
Upvotes: 1
Reputation: 8232
Before answering the question, I would like to point out that your method of benchmarking is problematic. I would not say a 200-line file is large or sufficient for the purpose of benchmarking. And there is an "official" way of benchmarking in Go, which you can read about in the docs of the testing package.
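For instance, a benchmark in the style the testing docs describe might look like the sketch below; the file name is taken from the question, and counting lines stands in for whatever work you actually want to measure:

package main

import (
    "bufio"
    "os"
    "testing"
)

// BenchmarkSequentialScan measures one full sequential pass over the file.
// Save as main_test.go and run with: go test -bench=.
func BenchmarkSequentialScan(b *testing.B) {
    for i := 0; i < b.N; i++ {
        file, err := os.Open("lorem.txt")
        if err != nil {
            b.Fatal(err)
        }
        scanner := bufio.NewScanner(file)
        lines := 0
        for scanner.Scan() {
            lines++
        }
        file.Close()
    }
}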
There is a Go idiom, probably the most famous one: Concurrency is not parallelism. What most people expect to make a program run faster is parallelism, not concurrency. In fact, on a single-core CPU, where parallelism is not possible, concurrency usually makes things slower, because switching between goroutines (threads, coroutines, whatever) has costs.
In your code, the situation is much like the single-core one: there is not much parallelism, but a lot of switching between goroutines. Furthermore, fmt.Println includes an I/O operation, and that operation requires synchronization, which cannot benefit from parallelism.
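To see how much parallelism is actually available on a given machine, a quick check (a minimal sketch) is:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // logical CPUs visible to the Go runtime
    fmt.Println("NumCPU:", runtime.NumCPU())
    // how many OS threads may execute Go code simultaneously
    // (calling GOMAXPROCS with 0 reads the value without changing it)
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}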
Upvotes: 2
Reputation: 8846
In exampleQueuedChannel you're not doing anything in parallel. Yes, you've launched another goroutine, but there is no parallel processing. The reason is that queue is an unbuffered chan. When you write to an unbuffered chan, the writer blocks until someone reads off it. So essentially you block while writing, then the scheduler has to put the goroutine to sleep and go wake up the reading goroutine. Then that one goes to sleep and the writer wakes back up. So you're thrashing between the two goroutines, and the scheduler is getting a heavy workout.
If you want to get better performance here, use a buffered chan. And if you want even more performance, bulk up multiple items per chan message (for a long technical explanation of the performance impact of channels, read this).
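As a rough sketch of both suggestions combined, assuming a hypothetical batch size of 512 lines and a channel capacity of 64 batches (both would need tuning for a real workload):

package main

import (
    "bufio"
    "fmt"
    "os"
)

const batchSize = 512 // hypothetical; tune for your workload

func main() {
    file, err := os.Open("lorem.txt")
    if err != nil {
        panic(err)
    }
    defer file.Close()

    queue := make(chan []string, 64) // buffered, so the producer rarely blocks
    done := make(chan int)

    // consumer: one channel receive now covers many lines
    go func() {
        count := 0
        for batch := range queue {
            count += len(batch)
        }
        done <- count
    }()

    scanner := bufio.NewScanner(file)
    batch := make([]string, 0, batchSize)
    for scanner.Scan() {
        batch = append(batch, scanner.Text())
        if len(batch) == batchSize {
            queue <- batch
            batch = make([]string, 0, batchSize) // fresh slice, no aliasing
        }
    }
    if len(batch) > 0 {
        queue <- batch // flush the final partial batch
    }
    close(queue)
    fmt.Println(<-done, "lines")
}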
In exampleWaitGroup, you're launching a new goroutine for every line. While launching new goroutines isn't expensive, it's not free either, and it is more work for the scheduler. defer is also not free. Also, your logger uses a mutex, so if two of your goroutines try to log at the same time, one is going to be put to sleep, and again, more scheduler work.
You can investigate these matters on your own by launching your code under a profiler and investigating where the bottlenecks lie.
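For example, a CPU profile can be captured with the standard runtime/pprof package (a minimal sketch; the profile file name is arbitrary) and then inspected with go tool pprof cpu.prof:

package main

import (
    "os"
    "runtime/pprof"
)

func main() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // everything between Start and Stop is profiled
    if err := pprof.StartCPUProfile(f); err != nil {
        panic(err)
    }
    defer pprof.StopCPUProfile()

    // ... run the code under investigation here ...
}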
Upvotes: 5