Jacopo Marconi
Jacopo Marconi

Reputation: 55

loop with array access is slow in Julia for high dimensional arrays

I have the same problem described and solved in this thread, where julia loops were considerably slowed down due to the repeated access to memory using a not optimal variable type. In my case however I have tensors (with dimensions higher than two) stored as multidimensional Array which need to be summed over and over in a for loop, something like:

function randomTensor(M,N)
    T = fill( zeros( M, M, M, M) , N)
    for i in 1 : N
        Ti = rand( M, M, M, M)
        T[i] += Ti
    end
    return T
end

I'm new to Julia (in fact I'm using it just for a specific project for which I need some julia modules), so I have only a general understanding of variable types and so forth, and till now I've not been able to define my tensors if not as general arrays. I would like a solution as the one proposed in the aforementioned link, where they use Matrix{Float64}[] to define variables, but apparently that works only for 1 or 2-dimensional arrays. Cannot use Tuples either, as I need to sum over the values. Any tip for different implementations?

Thank you all in advance.

Edit: the sample code I posted is just an example featuring the same structure of my code, I'm not interested in generating and summing random numbers : ) In other words, my question is: is there a better variable type than Array{Float, N} to do operations (e.g. tensor times scalar...) or that have "better access" to memory?

Upvotes: 2

Views: 388

Answers (2)

tholy
tholy

Reputation: 12179

Depending on the size of M, you might notice some advantage from using .+= instead of +=. The reason is that x += y is exactly equivalent to x = x + y, including allocating a new array to hold the result. In contrast, x .+= y is in-place. You can see that for yourself in the following way:

julia> a = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> pointer(a)
Ptr{Int64} @0x00007fa007d2c980

julia> a += a
3-element Vector{Int64}:
 2
 4
 6

julia> pointer(a)
Ptr{Int64} @0x00007fa02a06eac0   # different pointer means new memory was allocated

julia> a .+= a
3-element Vector{Int64}:
  4
  8
 12

julia> pointer(a)
Ptr{Int64} @0x00007fa02a06eac0   # same pointer means memory re-use

Upvotes: 2

Kristoffer Carlsson
Kristoffer Carlsson

Reputation: 2838

It's quite unclear what the question is. Your function is equivalent to:

randomTensor(M,N) = [rand(M, M, M, M) for i in 1:N]

Are you saying that this is too slow? The majority of the time should be in creating the random numbers and allocating the memory for the output.

Upvotes: 4

Related Questions