Julia code optimization, difference between structs and primitive types? (memory allocs)

Question

I have some code to optimize, with some critical parts in which I want to avoid any memory allocation (and the GC activity it causes).

To be more precise, I have a Real number type:

struct AFloat{T<:AbstractFloat} <: Real
    value::T
    j::Int 
end

I must track it to perform automatic differentiation: for every arithmetic operation I have to register something on a tape. Performance is really important here (one extra allocation per arithmetic operation makes a real difference!). I can choose between AFloat{T} and simply using a primitive type to track the index j:

primitive type AFloat64 <: Real 8*sizeof(Int) end  # the size is given in bits, not bytes
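
To illustrate the kind of registration meant here, a minimal sketch for the struct variant follows. TapeEntry, TAPE and record! are hypothetical names made up for this example; they are not the actual code, and a real tape would store more (partial derivatives, for instance).

```julia
# Hypothetical tape entry: which operation was applied to which two indices.
struct TapeEntry
    op::Symbol
    lhs::Int
    rhs::Int
end

# Hypothetical global tape; `const` keeps the binding type-stable.
const TAPE = TapeEntry[]

struct AFloat{T<:AbstractFloat} <: Real
    value::T
    j::Int
end

# Append one entry and return its index on the tape.
function record!(tape::Vector{TapeEntry}, op::Symbol, lhs::Int, rhs::Int)
    push!(tape, TapeEntry(op, lhs, rhs))
    return length(tape)
end

# Each arithmetic operation computes the value and registers itself.
Base.:+(a::AFloat{T}, b::AFloat{T}) where {T} =
    AFloat{T}(a.value + b.value, record!(TAPE, :+, a.j, b.j))

x = AFloat(1.0, 0)   # index 0 is used here to mark an untracked input (assumption)
y = AFloat(2.0, 0)
z = x + y            # z.j is the index of the new tape entry
```

The only growth in this sketch is the push! onto the tape; the question is whether constructing the AFloat result itself costs an allocation.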

However, I am confused by these results:

First part: ok

using BenchmarkTools

struct A n::Int64 end

vA=A[A(1)];
@time push!(vA,A(2))

v=Int64[1];
@time push!(v,2)

returns

0.000011 seconds (6 allocations: 224 bytes)
0.000006 seconds (5 allocations: 208 bytes)

which is consistent with:

@btime push!(vA,A(2))
@btime push!(v,2)

which returns

  46.410 ns (1 allocation: 16 bytes)
  37.890 ns (0 allocations: 0 bytes)

-> From this I would conclude that pushing a primitive type avoids one memory allocation compared to pushing a struct (is that right?)

Part two: ...problematic...?!

Here I am confused and cannot interpret these results:

foo_A() = A(1);             
foo_F64() = Float64(1.);                
foo_I64() = Int64(1);             

@time foo_A()
@time foo_F64()
@time foo_I64()

returns

  0.000004 seconds (5 allocations: 176 bytes)
  0.000005 seconds (5 allocations: 176 bytes)
  0.000005 seconds (4 allocations: 160 bytes)

Q1: How should I interpret the difference between foo_F64() and foo_I64() (5 allocations vs. 4)?

Moreover, these results seem inconsistent with the @btime output:

@btime foo_A()
  3.179 ns (0 allocations: 0 bytes)

@btime foo_F64()
  3.801 ns (0 allocations: 0 bytes)

@btime foo_I64()
  3.180 ns (0 allocations: 0 bytes)

Q2: Which result is right, @time or @btime, and why?

To summarize: in Julia, is there a difference in performance and memory allocation between foo_A and foo_Primitive, where:

struct A n::Int64 end

foo_A() = A(1)
foo_Primitive() = Int64(1)

I am aware that with such small expressions there is a real risk of measurement artifacts when using @time or @btime. Ideally, one would need some knowledge of Julia's internals to answer, but I do not have it.
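
One way such artifacts might be reduced, as a sketch only: call the function many times inside another function, run it once to compile, and then measure with @allocated. This assumes the extra allocations reported by @time come from timing at global scope rather than from foo_A itself; loop_A is a hypothetical helper introduced for this example.

```julia
struct A n::Int64 end

foo_A() = A(1)

# Hypothetical helper: call foo_A k times and use the result so the
# calls cannot simply be discarded.
function loop_A(k)
    s = 0
    for i in 1:k
        s += foo_A().n
    end
    return s
end

loop_A(1)                         # warm-up call so compilation is not measured
bytes = @allocated loop_A(1000)   # bytes allocated during the measured call
println("bytes allocated: ", bytes)
```

If the reported byte count does not grow with k, the struct construction itself is presumably not allocating.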

julia> versioninfo()
Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
