Reputation: 1827
Is the current Lua compiler smart enough to optimize away local variables that are used for clarity?
local top = x - y
local bottom = x + y
someCall(top, bottom)
Or does inlining things by hand run faster?
someCall(x - y, x + y)
Upvotes: 3
Views: 761
Reputation: 4271
Since Lua often compiles source code into byte code on the fly, it is designed to be a fast single-pass compiler. It does do some constant folding, but other than that there are not many optimizations. You can usually check what the compiler does by executing luac -l -l -p file.lua
and looking at the generated (disassembled) byte code.
In your case the Lua code
function a( x, y )
local top = x - y
local bottom = x + y
someCall(top, bottom)
end
function b( x, y )
someCall(x - y, x + y)
end
results int the following byte code listing when run through luac5.3 -l -l -p file.lua
(some irrelevant parts skipped):
function <file.lua:1,5> (7 instructions at 0xcd7d30)
2 params, 7 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
1 [2] SUB 2 0 1
2 [3] ADD 3 0 1
3 [4] GETTABUP 4 0 -1 ; _ENV "someCall"
4 [4] MOVE 5 2
5 [4] MOVE 6 3
6 [4] CALL 4 3 1
7 [5] RETURN 0 1
constants (1) for 0xcd7d30:
1 "someCall"
locals (4) for 0xcd7d30:
0 x 1 8
1 y 1 8
2 top 2 8
3 bottom 3 8
upvalues (1) for 0xcd7d30:
0 _ENV 0 0
function <file.lua:7,9> (5 instructions at 0xcd7f10)
2 params, 5 slots, 1 upvalue, 2 locals, 1 constant, 0 functions
1 [8] GETTABUP 2 0 -1 ; _ENV "someCall"
2 [8] SUB 3 0 1
3 [8] ADD 4 0 1
4 [8] CALL 2 3 1
5 [9] RETURN 0 1
constants (1) for 0xcd7f10:
1 "someCall"
locals (2) for 0xcd7f10:
0 x 1 6
1 y 1 6
upvalues (1) for 0xcd7f10:
0 _ENV 0 0
As you can see, the first variant (the a
function) has two additional MOVE
instructions, and two additional locals.
If you are interested in the details of the opcodes, you can check the comments for the OpCode
enum in lopcodes.h.
E.g. the opcode format for OP_ADD
is:
OP_ADD,/* A B C R(A) := RK(B) + RK(C) */
So the 2 [3] ADD 3 0 1
from above takes the values from registers 0 and 1 (the locals x
and y
in this case), adds them together, and stores the result in register 3. It is the second opcode in this function and the corresponding source code is on line 3.
Upvotes: 7