Ben S.
Ben S.

Reputation: 3545

Fastest way to draw from a distribution many times

This:

function draw1(n)
    return rand(Normal(0,1), Int(n))
end

is somewhat faster than this:

function draw2(n)
    result = zeros(Float64, Int(n))
    for i=1:Int(n)
        result[i] =  rand(Normal(0,1))
    end
    return result 
end

Just curious why that is, and if the explicit loop way can be speeded up (I tried @inbounds and @simd and didn't get a speedup). Is it the initial allocation of zeros()? I timed that separately at about 0.25 seconds, which doesn't fully account for the difference (plus doesn't the first way pre-allocate an array under the hood?).

Example:

@time x = draw1(1e08)
  1.169986 seconds (6 allocations: 762.940 MiB, 4.53% gc time)
@time y = draw2(1e08)
  1.824750 seconds (6 allocations: 762.940 MiB, 3.05% gc time)

Upvotes: 1

Views: 106

Answers (2)

Michael K. Borregaard
Michael K. Borregaard

Reputation: 8044

A shorter answer is that the built-in implementation is fastest, which is fortunately often the case.

Instead of draw4 above, you could just use the inbuilt

function draw5(n)
   result = Vector{Float64}(Int(n))
   rand!(Normal(0,1), result)
end

Filling an existing vector with something like rand! will always be inbounds.

Upvotes: 3

Bogumił Kamiński
Bogumił Kamiński

Reputation: 69949

Try this implementation:

function draw3(n)
    d = Normal(0,1)
    result = Vector{Float64}(Int(n))
    @inbounds for i=1:Int(n)
        result[i] =  rand(d)
    end
    return result 
end

What is the difference:

  • uses @inbounds
  • creates Normal(0,1) only once
  • performs faster initialization of result

When I test it it has essentially the same performance as draw1 (I have not tested it on 10e8 vector size though (not enough memory) - if you can run such @benchmark it would be nice):

julia> using BenchmarkTools                        

julia> @benchmark draw1(10e5)                      
BenchmarkTools.Trial:                              
  memory estimate:  7.63 MiB                       
  allocs estimate:  2                              
  --------------                                   
  minimum time:     12.296 ms (0.00% GC)           
  median time:      13.012 ms (0.00% GC)           
  mean time:        14.510 ms (8.49% GC)           
  maximum time:     84.253 ms (81.30% GC)          
  --------------                                   
  samples:          345                            
  evals/sample:     1                              

julia> @benchmark draw2(10e5)                      
BenchmarkTools.Trial:                              
  memory estimate:  7.63 MiB                       
  allocs estimate:  2                              
  --------------                                   
  minimum time:     20.374 ms (0.00% GC)           
  median time:      21.622 ms (0.00% GC)           
  mean time:        22.787 ms (5.95% GC)           
  maximum time:     92.265 ms (77.18% GC)          
  --------------                                   
  samples:          220                            
  evals/sample:     1                              

julia> @benchmark draw3(10e5)                      
BenchmarkTools.Trial:                              
  memory estimate:  7.63 MiB                       
  allocs estimate:  2                              
  --------------                                   
  minimum time:     12.415 ms (0.00% GC)           
  median time:      12.956 ms (0.00% GC)           
  mean time:        14.456 ms (8.67% GC)           
  maximum time:     84.342 ms (83.74% GC)          
  --------------                                   
  samples:          346                            
  evals/sample:     1                              

EDIT: actually defining a loop in a separate function (exactly as rand does) gives a bit better performance of draw4 than draw3:

function g!(d, v)
    @inbounds for i=1:length(v)
        v[i] = rand(d)
    end
end

function draw4(n)
    result = Vector{Float64}(Int(n))
    g!(Normal(0,1), result)
    return result 
end

Upvotes: 3

Related Questions