I'm working on a program that allocates lots of []int slices of length 4, 3, and 2,
and found that using a := []int{1, 1, 1}
is a little bit faster than a := make([]int, 3); a[0] = 1; a[1] = 1; a[2] = 1.
My question: why is a := []int{1, 1, 1}
faster than a := make([]int, 3); a[0] = 1; a[1] = 1; a[2] = 1?
func BenchmarkMake(b *testing.B) {
	var array []int
	for i := 0; i < b.N; i++ {
		array = make([]int, 4)
		array[0] = 1
		array[1] = 1
		array[2] = 1
		array[3] = 1
	}
}

func BenchmarkDirect(b *testing.B) {
	var array []int
	for i := 0; i < b.N; i++ {
		array = []int{1, 1, 1, 1}
	}
	array[0] = 1
}
BenchmarkMake-4 50000000 34.3 ns/op
BenchmarkDirect-4 50000000 33.8 ns/op
Let's look at the benchmark output of the following code:
package main

import "testing"

func BenchmarkMake(b *testing.B) {
	var array []int
	for i := 0; i < b.N; i++ {
		array = make([]int, 4)
		array[0] = 1
		array[1] = 1
		array[2] = 1
		array[3] = 1
	}
}

func BenchmarkDirect(b *testing.B) {
	var array []int
	for i := 0; i < b.N; i++ {
		array = []int{1, 1, 1, 1}
	}
	array[0] = 1
}

func BenchmarkArray(b *testing.B) {
	var array [4]int
	for i := 0; i < b.N; i++ {
		array = [4]int{1, 1, 1, 1}
	}
	array[0] = 1
}
The output usually looks like this:
$ go test -bench . -benchmem -o alloc_test -cpuprofile cpu.prof
goos: linux
goarch: amd64
pkg: test
BenchmarkMake-8 30000000 61.3 ns/op 32 B/op 1 allocs/op
BenchmarkDirect-8 20000000 60.2 ns/op 32 B/op 1 allocs/op
BenchmarkArray-8 1000000000 2.56 ns/op 0 B/op 0 allocs/op
PASS
ok test 6.003s
The difference between BenchmarkMake and BenchmarkDirect is so small that it is within measurement noise and can be reversed in some circumstances.
Let's look at the profiling data:
$ go tool pprof -list 'Benchmark.*' cpu.prof
ROUTINE ======================== test.BenchmarkMake in /home/grzesiek/go/src/test/alloc_test.go
260ms 1.59s (flat, cum) 24.84% of Total
. . 5:func BenchmarkMake(b *testing.B) {
. . 6: var array []int
40ms 40ms 7: for i := 0; i < b.N; i++ {
50ms 1.38s 8: array = make([]int, 4)
. . 9: array[0] = 1
130ms 130ms 10: array[1] = 1
20ms 20ms 11: array[2] = 1
20ms 20ms 12: array[3] = 1
. . 13: }
. . 14:}
ROUTINE ======================== test.BenchmarkDirect in /home/grzesiek/go/src/test/alloc_test.go
90ms 1.66s (flat, cum) 25.94% of Total
. . 16:func BenchmarkDirect(b *testing.B) {
. . 17: var array []int
10ms 10ms 18: for i := 0; i < b.N; i++ {
80ms 1.65s 19: array = []int{1, 1, 1, 1}
. . 20: }
. . 21: array[0] = 1
. . 22:}
ROUTINE ======================== test.BenchmarkArray in /home/grzesiek/go/src/test/alloc_test.go
2.86s 2.86s (flat, cum) 44.69% of Total
. . 24:func BenchmarkArray(b *testing.B) {
. . 25: var array [4]int
500ms 500ms 26: for i := 0; i < b.N; i++ {
2.36s 2.36s 27: array = [4]int{1, 1, 1, 1}
. . 28: }
. . 29: array[0] = 1
. . 30:}
We can see that the assignments take some time.
To learn why, we need to look at the assembly code.
$ go tool pprof -disasm 'BenchmarkMake' cpu.prof
. . 4eda93: MOVQ AX, 0(SP) ;alloc_test.go:8
30ms 30ms 4eda97: MOVQ $0x4, 0x8(SP) ;test.BenchmarkMake alloc_test.go:8
. . 4edaa0: MOVQ $0x4, 0x10(SP) ;alloc_test.go:8
10ms 1.34s 4edaa9: CALL runtime.makeslice(SB) ;test.BenchmarkMake alloc_test.go:8
. . 4edaae: MOVQ 0x18(SP), AX ;alloc_test.go:8
10ms 10ms 4edab3: MOVQ 0x20(SP), CX ;test.BenchmarkMake alloc_test.go:8
. . 4edab8: TESTQ CX, CX ;alloc_test.go:9
. . 4edabb: JBE 0x4edb0b
. . 4edabd: MOVQ $0x1, 0(AX)
130ms 130ms 4edac4: CMPQ $0x1, CX ;test.BenchmarkMake alloc_test.go:10
. . 4edac8: JBE 0x4edb04 ;alloc_test.go:10
. . 4edaca: MOVQ $0x1, 0x8(AX)
20ms 20ms 4edad2: CMPQ $0x2, CX ;test.BenchmarkMake alloc_test.go:11
. . 4edad6: JBE 0x4edafd ;alloc_test.go:11
. . 4edad8: MOVQ $0x1, 0x10(AX)
. . 4edae0: CMPQ $0x3, CX ;alloc_test.go:12
. . 4edae4: JA 0x4eda65
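We can see that the time spent on the assignments is attributed to the CMPQ instructions, which compare a constant with the CX register. CX holds a value copied from the stack after the call to runtime.makeslice, so we can deduce that it is the length of the slice, while AX holds the pointer to the underlying array. In other words, these comparisons are the bounds checks performed before each assignment. You can also see that the first bounds check was optimized into a TESTQ.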
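Conclusions
Why is using an array so much cheaper?
In Go, an array is basically a fixed-size chunk of memory, so BenchmarkArray needs no call to runtime.makeslice and no heap allocation (0 allocs/op in the output above). A [1]int is basically the same thing as an int. You can find more in the "Go Slices: usage and internals" article.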