Is using arithmetic faster than storing a variable?

Question

In C (or possibly in general), is it faster to compute a value with arithmetic or to read it from an array or variable?

For example, if I had

int myarray[7] = {16};
int mysixteen = 16;

then I can get 16 in a number of different ways

myarray[#]
mysixteen
16
1 << 4
10 + 6

Logically 16 would be the fastest, but that's not always convenient or practical for a set of numbers. An example of where this might be relevant is precomputing tables. Say you need bitmasks for 64 bits: you could fill an array

for (int i = 0; i < 64; ++i) {
    mask[i] = 1ULL << i;  /* 1ULL: a plain int 1 cannot be shifted by 31 or more */
}

and make calls to the array in the future, or make a macro

#define mask(b) (1ULL << (b))

and call that.

Frerich Raabe · Accepted Answer

In general, any of

16
1 << 4
10 + 6

will result in a literal 16, because the compiler almost certainly implements an optimization called constant folding.
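One way to convince yourself that such expressions are evaluated by the compiler is to use them in places where C requires an integer constant expression. This is only a minimal sketch, assuming a C11 compiler; the names SIXTEEN_SHIFT, SIXTEEN_SUM, table and folded_values_agree are made up for illustration.

```c
/* Hypothetical names, for illustration only. */

/* Enumerator values must be integer constant expressions. */
enum { SIXTEEN_SHIFT = 1 << 4, SIXTEEN_SUM = 10 + 6 };

/* _Static_assert (C11) is checked during compilation, not at run time. */
_Static_assert((1 << 4) == 16, "1 << 4 is not 16");
_Static_assert((10 + 6) == 16, "10 + 6 is not 16");

/* An array size at file scope must also be a constant expression. */
static int table[1 << 4];

/* Returns 1 if all three spellings ended up as the value 16. */
int folded_values_agree(void) {
    return SIXTEEN_SHIFT == 16
        && SIXTEEN_SUM == 16
        && sizeof table / sizeof table[0] == 16;
}
```

If either _Static_assert were false, the file would not compile at all, which shows the arithmetic is done long before the program runs.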

The performance of

mysixteen
myarray[n]

is probably lower, depending on where the values of those variables are stored. In memory? If so, is that memory in one of the CPU caches? Or is the value held in a CPU register? There's no definitive answer.
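For the bitmask case from the question, the two approaches can be put side by side. This is a sketch, assuming the masks are stored as uint64_t; mask_table, init_mask_table, mask_from_table and mask_from_shift are hypothetical names. Which variant is faster depends on whether the table is in cache and on what the compiler does with the surrounding code, so it is worth measuring rather than guessing.

```c
#include <stdint.h>

/* Hypothetical lookup table: each use is a load from memory. */
static uint64_t mask_table[64];

/* Fill the table once, before the first lookup. */
void init_mask_table(void) {
    for (int i = 0; i < 64; ++i) {
        /* UINT64_C(1) makes the shift 64 bits wide. A plain 1 is an int,
           and with a 32-bit int, shifting it by 31 or more is undefined. */
        mask_table[i] = UINT64_C(1) << i;
    }
}

/* Table variant: b must be in [0, 63]. */
uint64_t mask_from_table(unsigned b) {
    return mask_table[b];
}

/* Arithmetic variant: a single shift, no table load. b must be in [0, 63]. */
uint64_t mask_from_shift(unsigned b) {
    return UINT64_C(1) << b;
}
```

Both functions return the same value for every valid b, so they can be swapped freely while benchmarking.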

In general, for a specific program, you can always see what your compiler gives you, but note that this may change a lot depending on the surrounding code and your optimization flags.

To try it yourself, consider this small program:

int f() { return 16; }

int g() { return 1 << 4; }

int h() { return 10 + 6; }

int i() {
    int myarray[7] = { 16 };
    return myarray[3];
}

int j() {
    int mysixteen = 16;
    return mysixteen;
}

If I compile it using gcc 4.7.2 and then check the disassembly, like

$ gcc -c so19802742.c -o so19802742.o
$ objdump --disassemble so19802742.o

I get this:

0000000000000000 <f>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   b8 10 00 00 00          mov    $0x10,%eax
   9:   5d                      pop    %rbp
   a:   c3                      retq   

000000000000000b <g>:
   b:   55                      push   %rbp
   c:   48 89 e5                mov    %rsp,%rbp
   f:   b8 10 00 00 00          mov    $0x10,%eax
  14:   5d                      pop    %rbp
  15:   c3                      retq   

0000000000000016 <h>:
  16:   55                      push   %rbp
  17:   48 89 e5                mov    %rsp,%rbp
  1a:   b8 10 00 00 00          mov    $0x10,%eax
  1f:   5d                      pop    %rbp
  20:   c3                      retq   

0000000000000021 <i>:
  21:   55                      push   %rbp
  22:   48 89 e5                mov    %rsp,%rbp
  25:   48 c7 45 e0 00 00 00    movq   $0x0,-0x20(%rbp)
  2c:   00 
  2d:   48 c7 45 e8 00 00 00    movq   $0x0,-0x18(%rbp)
  34:   00 
  35:   48 c7 45 f0 00 00 00    movq   $0x0,-0x10(%rbp)
  3c:   00 
  3d:   c7 45 f8 00 00 00 00    movl   $0x0,-0x8(%rbp)
  44:   c7 45 e0 10 00 00 00    movl   $0x10,-0x20(%rbp)
  4b:   8b 45 ec                mov    -0x14(%rbp),%eax
  4e:   5d                      pop    %rbp
  4f:   c3                      retq   

0000000000000050 <j>:
  50:   55                      push   %rbp
  51:   48 89 e5                mov    %rsp,%rbp
  54:   c7 45 fc 10 00 00 00    movl   $0x10,-0x4(%rbp)
  5b:   8b 45 fc                mov    -0x4(%rbp),%eax
  5e:   5d                      pop    %rbp
  5f:   c3                      retq

Note how, due to constant folding, f, g and h yield exactly the same machine code. The array access in i generates the most machine code (which is not necessarily the slowest!), and j is somewhere in between.

However, this is without any optimizations enabled at all! The code generated when compiling with e.g. -O2 may be totally different, because the compiler can notice that each of the five functions just returns a compile-time constant: 16 for f, g, h and j. Note that i as written reads myarray[3], which the initializer { 16 } leaves at 0; only myarray[0] holds 16.
