Reputation:
Here is a code snippet from the book "The Go Programming Language":
for t := 0.0; t < cycles*2*math.Pi; t += res {
...
}
It appears that the expression in the for loop condition t < cycles*2*math.Pi
must be evaluated before every iteration of the for loop. Or, does the compiler optimize this by pre-calculating the result of the expression (assuming none of the variables change during the iteration)? Does the above style of coding affect performance?
Upvotes: 4
Views: 1228
Reputation: 16898
The current Go compiler does not move loop invariant computations outside loops.
The list of passes of the compiler can be seen here https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/compile.go#L329
In the example by @creker, the compiler did constant folding, not loop invariant code motion.
As a side note, I did make a few months ago a LICM pass for the Go compiler https://github.com/golang/go/compare/master...momchil-velikov:dev.chill.licm
which does not improve performance very much on the typically used Go benchmarks. (I blame the atrocious register allocation :P)
Upvotes: 1
Reputation: 9570
This really depends on the Go version but go version go1.7 windows/amd64
does appear to calculate the value once.
Go code:
var cycles = 10.0
var res = 1000.0
for t := 0.0; t < cycles*2*math.Pi; t += res {
}
Asm code:
movsd [rsp+58h+var_20], xmm0
mov [rsp+58h+var_18], 0
mov [rsp+58h+var_10], 0
lea rax, qword_494500
mov [rsp+58h+var_58], rax
lea rax, [rsp+58h+var_20]
mov [rsp+58h+var_50], rax
mov [rsp+58h+var_48], 0
call runtime_convT2E
mov rax, [rsp+58h+var_40]
mov rcx, [rsp+58h+a] ; a
mov [rsp+58h+var_18], rax
mov [rsp+58h+var_10], rcx
lea rax, [rsp+58h+var_18]
mov [rsp+58h+var_58], rax
mov [rsp+58h+var_50], 1
mov [rsp+58h+var_48], 1
call fmt_Println
movsd xmm0, cs:$f64_408f400000000000
movsd xmm1, [rsp+58h+t]
addsd xmm0, xmm1
movsd [rsp+58h+t], xmm0
movsd xmm1, cs:$f64_404f6a7a2955385e
ucomisd xmm1, xmm0
ja loc_401083
f64_404f6a7a2955385e
is a precalculated double value equal to 10 * 2 * math.Pi
or 62.8318530718
Go compiler recently switched to SSA, so these kind of optimizations will just keep improving as they greatly benefit from it. For now SSA is only available on amd64:
Compiler Toolchain
This release includes a new code generation back end for 64-bit x86 systems, following a proposal from 2015 that has been under development since then. The new back end, based on SSA, generates more compact, more efficient code and provides a better platform for optimizations such as bounds check elimination.
1.8 should have it for all supported architectures:
Compiler Toolchain
Go 1.7 introduced a new compiler back end for 64-bit x86 systems. In Go 1.8, that back end has been developed further and is now used for all architectures.
Upvotes: 5
Reputation: 1327784
It does not seems optimized, and is not listed in "Compiler And Runtime Optimizations".
As mentioned in this older discussion,
The gc compiler does not do any loop optimizations. One of the major goals of that compiler is to compile quickly. While improved optimization is always useful, it has to fit within that goal
But that could have changed. A good technique to see what is going on is illustrated in "Recursion And Tail Calls In Go", where you can look at the assembly code produced by a go program.
See also the more recent 2016 "Reversing GO binaries like a pro" article.
And "go.godbolt.org" can help too: see the assembly code here.
You can see the "t < cycles*2*math.Pi
" part always evaluated
.L2:
movsd xmm0, QWORD PTR [rbp-24]
addsd xmm0, xmm0
movsd xmm1, QWORD PTR .LC2[rip]
mulsd xmm0, xmm1
ucomisd xmm0, QWORD PTR [rbp-8]
seta al
test al, al
Upvotes: 1