Reputation: 411
I just finished studying Julia (and most importantly the performance tips!). I learned that using non-constant global variables makes code slower, and that the counter-measure is to pass as many variables as possible as function arguments. So I ran the following test:
x = 10.5 #these are globals
y = 10.5
function bench1() #acts on globals
    z = 0.0
    for i in 1:100
        z += x^y
    end
    return z
end
function bench2(x, y)
    z = 0.0
    for i in 1:100
        z += x^y
    end
    return z
end
function bench3(x::Float64, y::Float64) #acts on arguments
    z::Float64 = 0.0
    for i in 1:100
        z += x^y
    end
    return z
end
@time [bench1() for j in 1:100]
@time [bench2(x,y) for j in 1:100]
@time [bench3(x,y) for j in 1:100]
I have to admit that the results were completely unexpected and not in agreement with what I had read. Results:
0.001623 seconds (20.00 k allocations: 313.375 KB)
0.003628 seconds (2.00 k allocations: 96.371 KB)
0.002633 seconds (252 allocations: 10.469 KB)
On average, the first function, which acts on the global variables directly, is consistently faster by about a factor of 2 than the last function, which has all the type annotations AND does not touch the globals at all. Can someone explain to me why?
Upvotes: 1
Views: 295
Reputation: 5325
One more problem is that the following are still in global scope:
@time [bench1() for j in 1:100]
@time [bench2(x,y) for j in 1:100]
@time [bench3(x,y) for j in 1:100]
as you can see from the still huge number of allocations reported by @time.
Wrap all these in a function:
function runbench(N)
    x = 3.0
    y = 4.0
    @time [bench1() for j in 1:N]
    @time [bench2(x,y) for j in 1:N]
    @time [bench3(x,y) for j in 1:N]
end
Warm up with runbench(1), then for runbench(10^5) I get:
1.425985 seconds (20.00 M allocations: 305.939 MB, 9.93% gc time)
0.061171 seconds (2 allocations: 781.313 KB)
0.062037 seconds (2 allocations: 781.313 KB)
The total memory allocated in cases 2 and 3 is 10^5 times 8 bytes, as expected.
The moral is to largely ignore the raw timings and look instead at the memory allocations, which is where the information about type stability shows up.
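You can also inspect the type instability directly rather than inferring it from allocations: @code_warntype prints the code with the types the compiler inferred, and any non-concrete type (shown as Any) flags the problem. A minimal sketch reproducing the questioner's setup:

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

x = 10.5  # non-const global: its type could change at any time,
y = 10.5  # so the compiler cannot assume Float64 inside bench1

function bench1()
    z = 0.0
    for i in 1:100
        z += x^y  # x and y are inferred as Any here
    end
    return z
end

# Prints the lowered code with inferred types; look for `Any`.
@code_warntype bench1()
```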
EDIT: bench3 is an "anti-pattern" in Julia (i.e. a style of coding that should not be used) -- you should never annotate types merely in an attempt to fix type instabilities; that is not what type annotations are for in Julia.
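The idiomatic fixes are either to pass the globals as arguments (as bench2 does) or to declare them const, which tells the compiler the binding's type will never change. A sketch of the const variant (names cx, cy, bench1_const are my own, for illustration):

```julia
const cx = 10.5  # `const` fixes the binding's type,
const cy = 10.5  # restoring type stability without any annotations

function bench1_const()
    z = 0.0
    for i in 1:100
        z += cx^cy  # now inferred as Float64
    end
    return z
end
```

With const globals, bench1_const should allocate like bench2 and bench3, even though it still reads globals directly.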
Upvotes: 8
Reputation: 7385
I guess this is mainly because of compilation time. If I change the "main" code to
N = 10^2
println("N = $N")
println("bench1")
@time [bench1() for j in 1:N]
@time [bench1() for j in 1:N]
println("bench2")
@time [bench2(x,y) for j in 1:N]
@time [bench2(x,y) for j in 1:N]
it gives
N = 100
bench1
0.004219 seconds (21.46 k allocations: 376.536 KB)
0.001792 seconds (20.30 k allocations: 322.781 KB)
bench2
0.006218 seconds (2.29 k allocations: 105.840 KB)
0.000914 seconds (402 allocations: 11.844 KB)
So in the second measurement, bench1() is slower than bench2() by a factor of ~2. (I omitted bench3() because it gives the same results as bench2().) If we increase N to 10^5, the compilation time becomes negligible compared to the calculation time, so we can see the expected speedup for bench2() even in the first measurement.
N = 100000
bench1
1.767392 seconds (20.70 M allocations: 321.219 MB, 8.25% gc time)
1.720564 seconds (20.70 M allocations: 321.166 MB, 6.26% gc time)
bench2
0.923315 seconds (799.85 k allocations: 17.608 MB, 0.96% gc time)
0.922132 seconds (797.96 k allocations: 17.517 MB, 1.08% gc time)
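Instead of running @time twice by hand to discard the compilation run, the BenchmarkTools.jl package's @btime macro warms up and takes many samples, so compilation never appears in the reported time; interpolating globals with $ also keeps the global-lookup cost out of the measurement. A sketch, assuming BenchmarkTools.jl is installed:

```julia
using BenchmarkTools

x = 10.5
y = 10.5

function bench2(x, y)
    z = 0.0
    for i in 1:100
        z += x^y
    end
    return z
end

# `$` interpolates the globals' current values into the benchmark
# expression, so the timing reflects bench2 itself, not global access.
@btime bench2($x, $y)
```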
Upvotes: 6