Bogdan  Dubyk
Bogdan Dubyk

Reputation: 5564

confusing concurrency and performance issue in Go

now I start learning Go language by watching this great course. To be clear for years I write only on PHP and concurrency/parallelism is new for me, so I little confused by this.

In this course, there is a task to create a program which calculates factorial with 100 computations. I went a bit further and to comparing performance I changed it to 10000 and for some reason, the sequential program works same or even faster than concurrency.

Here I'm going to provide 3 solutions: mine, teachers and sequential

My solution:

package main

import (
 "fmt"
)

func gen(steps int) <-chan int{
     out := make(chan int)
     go func() {
         for j:= 0; j <steps; j++ {
             out <- j
         }
         close(out)
      }()
      return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int) 
    go func() {
        for n := range in {
            out <-  fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for n:= range factorial(gen(10)) {
            fmt.Println(n)
        }
     }
}

execution time:


Teacher solution: package main

import (
 "fmt"
)

func gen(steps int) <-chan int{
    out := make(chan int)
    go func() {
        for i := 0; i < steps; i++ {
            for j:= 0; j <10; j++ {
                out <- j
            }
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int) 
    go func() {
        for n := range in {
            out <-  fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for n:= range factorial(gen(steps)) {
        fmt.Println(n)
    }
}

execution time:


Sequential:

package main

import (
 "fmt"
)

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for j:= 0; j <10; j++ {
            fmt.Println(fact(j))
        }
    }
}

execution time:


So, as you can see the sequential solution is fastest, teachers solution is in the second place and my solution is third.

First question: why the sequential solution is fastest? And second why my solution is so slow? if I understanding correctly in my solution I'm creating 10000 goroutines inside gen and 10000 inside factorial and in teacher solution, he is creating only 1 goroutine in gen and 1 in factorial. My so slow because I'm creating too many unneeded goroutines?

Upvotes: 3

Views: 1005

Answers (2)

peterSO
peterSO

Reputation: 166835

Let's start with some fundamental benchmarks for factorial computation.

$ go test -run=! -bench=. factorial_test.go 
goos: linux
goarch: amd64
BenchmarkFact0-4            1000000000           2.07 ns/op
BenchmarkFact9-4            300000000            4.37 ns/op
BenchmarkFact0To9-4         50000000            36.0 ns/op
BenchmarkFact10K0To9-4          3000        384069 ns/op
$ 

The CPU time is very small, even for 10,000 iterations of factorials zero through nine.

factorial_test.go:

package main

import "testing"

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

var sinkFact int

func BenchmarkFact0(b *testing.B) {
    for N := 0; N < b.N; N++ {
        j := 0
        sinkFact = fact(j)
    }
}

func BenchmarkFact9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        j := 9
        sinkFact = fact(j)
    }
}

func BenchmarkFact0To9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        for j := 0; j < 10; j++ {
            sinkFact = fact(j)
        }
    }
}

func BenchmarkFact10K0To9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        steps := 10000
        for i := 0; i < steps; i++ {
            for j := 0; j < 10; j++ {
                sinkFact = fact(j)
            }
        }
    }
}

Let's look at the time for the sequential program.

$ go build -a sequential.go && time ./sequential
real    0m0.247s
user    0m0.054s
sys     0m0.149s

Writing to the terminal is obviously a major bottleneck. Let's write to a sink.

$ go build -a sequential.go && time ./sequential > /dev/null
real    0m0.070s
user    0m0.049s
sys     0m0.020s

It's still a lot more than the 0m0.000000384069s for the factorial computation.

sequential.go:

package main

import (
    "fmt"
)

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for j := 0; j < 10; j++ {
            fmt.Println(fact(j))
        }
    }
}

Attempts to use concurrency for such a trivial amount of parallel work are likely to fail. Go goroutines and channels are cheap, but they are not free. Also, a single channel and a single terminal are the bottleneck, the limiting factor, even when writing to a sink. See Amdahl's Law for parallel computing. See Concurrency is not parallelism.

$ go build -a teacher.go && time ./teacher > /dev/null
real    0m0.123s
user    0m0.123s
sys     0m0.022s

$ go build -a student.go && time ./student > /dev/null
real    0m0.135s
user    0m0.113s
sys     0m0.038s

teacher.go:

package main

import (
    "fmt"
)

func gen(steps int) <-chan int {
    out := make(chan int)
    go func() {
        for i := 0; i < steps; i++ {
            for j := 0; j < 10; j++ {
                out <- j
            }
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for n := range factorial(gen(steps)) {
        fmt.Println(n)
    }
}

student.go:

package main

import (
    "fmt"
)

func gen(steps int) <-chan int {
    out := make(chan int)
    go func() {
        for j := 0; j < steps; j++ {
            out <- j
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for n := range factorial(gen(10)) {
            fmt.Println(n)
        }
    }
}

Upvotes: 1

GilesW
GilesW

Reputation: 515

It's the difference between concurrency and parellelism - your's, you teachers and the sequential are progressively less concurrent in design but how parallel they are depends on number of CPU cores and there is a set up and communication cost associated with concurrency. There are no asynchronous calls in the code so only parallelism will improve speed.

This is worth a look: https://blog.golang.org/concurrency-is-not-parallelism

Also, even with parallel cores speedup will be dependent on nature of the workload - google Amdahl's law for explanation.

Upvotes: 2

Related Questions