gTcV
gTcV

Reputation: 2504

Fast tensor initialisation in Julia

I would like to initialize a 3d tensor (multi-dimensional array) with the values of the "diagonal Gaussian"

exp(-32*(u^2 + 16*(v^2 + w^2)))

where u = 1/sqrt(3)*(x+y+z) and v,w are any two coordinates orthogonal to u, discretised on a uniform mesh on [-1,1]^3. The following code achieves this:

function gaussian3d(n)
    q = qr(ones(3,1), thin=false)[1]
    x = linspace(-1.,1., n)
    p = Array(Float64,(n,n,n))
    square(x) = x*x
    Base.@nloops 3 i p begin
        @inbounds p[i_1,i_2,i_3] = 
            exp(
                -32*(
                    square(q[1,1]*x[i_1] + q[2,1]*x[i_2] + q[3,1]*x[i_3])
                    + 16*(
                        square(q[1,2]*x[i_1] + q[2,2]*x[i_2] + q[3,2]*x[i_3]) + 
                        square(q[1,3]*x[i_1] + q[2,3]*x[i_2] + q[3,3]*x[i_3])
                    )
                )
            )
    end
    return p
end

It seems to be quite slow, however. For example, if I replace the defining function with exp(x*y*z), the code runs 50x faster. Also, the @time macro reports ~20% GC time for the above code which I do not understand where they come from. (These numeric values were obtained with n = 128.) My questions therefore are

Upvotes: 3

Views: 392

Answers (1)

rickhg12hs
rickhg12hs

Reputation: 11912

Knowing nothing of 3D tensors with values of the "diagonal Gaussian", using thesquare comment from the original post, "typing" q (@code_warntype helps here: Big performance jump!), and further specializing the @nloops, this works much faster on the platforms I tried it on.

julia> square(x::Float64) = x * x
square (generic function with 1 method)

julia> function my_gaussian3d(n)
           q::Array{Float64,2} = qr(ones(3,1), thin=false)[1]
           x = linspace(-1.,1., n)
           p = Array(Float64,(n,n,n))
           Base.@nloops 3 i p d->x_d=x[i_d] begin
               @inbounds p[i_1,i_2,i_3] =
                   exp(
                       -32*(
                           square(q[1,1]*x_1 + q[2,1]*x_2 + q[3,1]*x_3)
                           + 16*(
                               square(q[1,2]*x_1 + q[2,2]*x_2 + q[3,2]*x_3) +
                               square(q[1,3]*x_1 + q[2,3]*x_2 + q[3,3]*x_3)
                           )
                       )
                   )
           end
           return p
       end
my_gaussian3d (generic function with 1 method)

julia> @time gaussian3d(128);
elapsed time: 3.952389337 seconds (1264 MB allocated, 4.50% gc time in 57 pauses with 0 full sweep)

julia> @time gaussian3d(128);
elapsed time: 3.527316699 seconds (1264 MB allocated, 4.42% gc time in 58 pauses with 0 full sweep)

julia> @time my_gaussian3d(128);
elapsed time: 0.285837566 seconds (16 MB allocated)

julia> @time my_gaussian3d(128);
elapsed time: 0.28476448 seconds (16 MB allocated, 1.22% gc time in 0 pauses with 0 full sweep)

julia> my_gaussian3d(128) == gaussian3d(128)
true

Upvotes: 2

Related Questions