Reputation: 757
I would like to calculate the summation of elements from a large upper triangular matrix. The regular Julia code is below.
function upsum(M); n = size(M)[1]; sum = 0
for i = 1:n-1 for j = i+1:n
sum = sum + M[i,j]
end
end
return sum
end
R = randn(10000,10000)
upsum(R)
Since the matrix is very large, I would like to know is there anyway to improve the speed. How can I use parallel computing here?
Upvotes: 2
Views: 1212
Reputation: 69819
I would use threads not parallel processing in this case. Here is an example code:
using Base.Threads
function upsum_threads(M)
n = size(M, 1)
chunks = nthreads()
sums = zeros(eltype(M), chunks)
chunkend = round.(Int, n * sqrt.((1:chunks) ./ chunks))
@assert minimum(diff(chunkend)) > 0
chunkstart = [2; chunkend[1:end-1] .+ 1]
@threads for job in 1:chunks
s = zero(eltype(M))
for i in chunkstart[job]:chunkend[job]
@simd for j in 1:(i-1)
@inbounds s += M[j, i]
end
end
sums[job] = s
end
return sum(sums)
end
R = randn(10000,10000)
upsum_threads(R)
It should give you a significant speedup (even if you remove @threads
it should be much faster).
You choose number of threads Julia uses by setting JULIA_NUM_THREADS
environment variable.
Upvotes: 4