2mac

Reputation: 1689

Variable storage versus redundant arithmetic

I'm writing a very simple loop in Lua for a LÖVE game I'm developing. I understand that I'll spend more time worrying about this than the answer could ever save me in CPU clock cycles, but I want a deeper understanding of how this works.

The current body of the loop looks like this:

  local low = mid - diff
  local high = mid + diff

  love.graphics.line(low, 0, low, wheight)
  love.graphics.line(high, 0, high, wheight)

I want to know if it will be more computationally efficient to keep it as is or to change it to:

  love.graphics.line(mid - diff, 0, mid - diff, wheight)
  love.graphics.line(mid + diff, 0, mid + diff, wheight)

With the second body, I have to calculate the low and high differences twice each. With the first, I have to store them in memory and access them twice each.

Which is more efficient?

Upvotes: 1

Views: 66

Answers (1)

user4842163

The short answer is that it's unlikely to make any difference at all. Even if there is a difference, the code right next to it draws a line. Drawing even an aliased line with a highly optimized Bresenham implementation in native code is enormously expensive compared to an add and a subtract. The function call alone will likely dwarf that cost.
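If you want to measure this yourself, here is a minimal timing sketch. It makes several assumptions that are not in the question: `draw_line` is a hypothetical stand-in for `love.graphics.line` so the script runs outside LÖVE, `diff` is taken to be the loop variable, and the values of `mid`, `wheight`, and `n` are arbitrary.

```lua
-- Hypothetical stand-in for love.graphics.line so this runs outside LÖVE.
-- It only counts calls; the real function does far more work per call.
local calls = 0
local function draw_line(x1, y1, x2, y2)
  calls = calls + 1
end

local wheight = 600   -- assumed window height
local mid = 400       -- assumed midpoint

-- Variant 1: store the intermediate results in locals.
local function with_locals(n)
  for diff = 1, n do
    local low = mid - diff
    local high = mid + diff
    draw_line(low, 0, low, wheight)
    draw_line(high, 0, high, wheight)
  end
end

-- Variant 2: write each expression out twice.
local function inline(n)
  for diff = 1, n do
    draw_line(mid - diff, 0, mid - diff, wheight)
    draw_line(mid + diff, 0, mid + diff, wheight)
  end
end

-- Returns the CPU time in seconds that f(n) took.
local function time_it(f, n)
  local start = os.clock()
  f(n)
  return os.clock() - start
end

local n = 1000000
local t_locals = time_it(with_locals, n)
local t_inline = time_it(inline, n)
print(string.format("locals: %.4fs  inline: %.4fs", t_locals, t_inline))
```

Because the stub does almost nothing, any gap this shows is an upper bound on what you would see with real drawing, where the cost of each call is much larger.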

"With the second body, I have to calculate the low and high differences twice each. With the first, I have to store them in memory and access them twice each."

This is not necessarily the case. Variables don't necessarily occupy memory in ways that expressions don't; they can map directly to a register. Likewise, avoiding variables doesn't necessarily avoid memory. The results of expressions are computed and held in registers too, whether or not you explicitly assign the intermediate results to variables.

So from a memory standpoint, both versions of your code need to use registers to store intermediate results of a computation.

Caching a result in a simple variable doesn't carry that kind of memory overhead, mainly because such types map directly to registers without stack spills. When you start precomputing whole arrays/tables, doing the extra computation can sometimes be faster than memoization, if memoization means more DRAM access (in which case the memory overhead can outweigh the savings). But simple POD-type variables like numbers don't have that DRAM overhead; they map directly to registers. In other words, they're often literally free: the compiler will emit the same machine code whether or not you assign the results of your expressions to local variables, and the same number of registers will be required.
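To make the table case concrete, here is a small sketch of the two approaches side by side. The names `lows`, `highs`, `sum_from_tables`, and `sum_recomputed`, and the values of `mid` and `count`, are made up for illustration. Summing the values stands in for whatever the loop would really do with them.

```lua
local mid, count = 400, 1000   -- assumed values

-- Variant A: precompute every low/high pair into tables, then read them back.
-- Each read is a table access, which goes through memory.
local lows, highs = {}, {}
for diff = 1, count do
  lows[diff] = mid - diff
  highs[diff] = mid + diff
end

local function sum_from_tables()
  local total = 0
  for diff = 1, count do
    total = total + lows[diff] + highs[diff]
  end
  return total
end

-- Variant B: recompute each value on the fly; nothing is stored in a table.
local function sum_recomputed()
  local total = 0
  for diff = 1, count do
    total = total + (mid - diff) + (mid + diff)
  end
  return total
end

print(sum_from_tables(), sum_recomputed())
```

Both functions produce the same total; which one is faster depends on the runtime and on how large the tables get, so it is worth timing rather than assuming.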

Local variables for data types that map directly to GP registers are best thought of as existing only while you're in high-level coding land. By the time the JIT or interpreter compiles your code into a form the machine understands, they'll disappear and turn into registers regardless of whether you created those variables or not.

Probably the ultimate question, if there is to be any difference, is whether the redundant computation can be eliminated. It takes only the most trivial optimizer to figure out that mid - diff, written twice in the exact same statement, only needs to be computed once. I'd be surprised if an optimizing compiler didn't eliminate it by the time the code reaches the IR instruction selection and register allocation stage.
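Whether that elimination actually happens depends on which runtime executes the code. LuaJIT, which LÖVE uses by default, has an optimizing trace compiler that includes common-subexpression elimination, while the reference PUC-Rio Lua interpreter compiles to bytecode with very little optimization. A quick way to see which one you are on (the variable name `runtime` is just for illustration):

```lua
-- `jit` is a global table that LuaJIT defines; plain PUC-Rio Lua leaves it nil.
local runtime
if type(jit) == "table" then
  runtime = jit.version   -- a version string provided by LuaJIT
else
  runtime = _VERSION      -- the standard Lua version string
end
print("running on: " .. runtime)
```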

But even if that turned out not to be the case, and the Lua interpreter failed to recognize the completely redundant computation and performed it anyway, you still have code next to it that renders a line (which involves a rasterization loop). Relatively speaking, the arithmetic is practically free even with the redundancy. It's not worth sweating such small stuff here, and this is coming from someone obsessed with shaving clock cycles.

Upvotes: 2
