Reputation: 21
I've spent the last month or so learning Julia and I'm very impressed. In particular, I'm analysing large amounts of climate model output: I put it all into SharedArrays and adjust and plot it in parallel. So far it's been very quick and efficient, and I've built up quite a library of code. My current problem is creating a function that can do basic operations on two SharedArrays. I've successfully written a function that takes two arrays and a function specifying how to combine them. The code is based on the example in the parallel section of the Julia docs and uses the myrange
function shown there:
function myrange(q::SharedArray)
    idx = indexpids(q)
    if idx == 0
        # This worker is not assigned a piece
        print("NO WORKERS ASSIGNED")
        return 1:0
    end
    nchunks = length(procs(q))
    splits = [round(Int, s) for s in linspace(0, length(q), nchunks+1)]
    splits[idx]+1:splits[idx+1]
end
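For example, with 4 processes attached to a length-100 array, the linspace arithmetic splits the indices into contiguous quarters:

splits = [round(Int, s) for s in linspace(0, 100, 5)]   # [0, 25, 50, 75, 100]
# the worker with indexpids == 2 then gets splits[2]+1:splits[3], i.e. 26:50

Each worker then processes its piece with: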
function combine_arrays_chunk!(array_1, array_2, output_array, func, length_range)
    for i in length_range
        output_array[i] = func(array_1[i], array_2[i])
        # hardwired example for func = +:
        # output_array[i] = array_1[i] + array_2[i]
    end
    output_array
end
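(Run serially over the whole index range, this gives the same answer as the hardwired version, e.g. combine_arrays_chunk!(a, b, out, +, 1:length(a)).)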
combine_arrays_shared_chunk!(array_1, array_2, output_array, func) =
    combine_arrays_chunk!(array_1, array_2, output_array, func, myrange(array_1))
function combine_arrays_shared(array_1::SharedArray, array_2::SharedArray, func)
    if size(array_1) != size(array_2)
        error("inputs are not the same size")
    end
    output_array = SharedArray(Float64, size(array_1))
    @sync begin
        for p in procs(array_1)
            @async remotecall_wait(p, combine_arrays_shared_chunk!, array_1, array_2, output_array, func)
        end
    end
    output_array
end
This works, so one can do

strain_div = combine_arrays_shared(eps_1, eps_2, +);
strain_tot = combine_arrays_shared(eps_1, eps_2, hypot);
with the correct results, and the output is a SharedArray as required. But ... it's quite slow. It's actually quicker to convert the SharedArrays to normal arrays, combine them on one processor, and then convert back to a SharedArray (for my test cases anyway, with each array approx 200MB; when I move up to GBs I guess that won't hold). I can hardwire combine_arrays_shared to only do addition (or some other specific function), and then I get the speed increase, but with the function passed as the func argument the whole thing is slow (10 times slower than the hardwired addition).
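(For reference, the single-processor version I'm comparing against is essentially this sketch; sdata returns the underlying Array of a SharedArray:)

# single-process comparison: operate on the underlying Arrays,
# then copy the result back into a fresh SharedArray
output_array = SharedArray(Float64, size(array_1))
output_array[:] = map(func, sdata(array_1), sdata(array_2))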
I've looked at the FastAnonymous.jl package, but I can't see how it would work in this case. I tried, and failed. Any ideas?
I might just resort to writing a different combine_arrays_... function for each basic function I use, or to making the func argument an option and calling different hardwired functions from within combine_arrays_shared (roughly as sketched below), but I want it to be more elegant! Also, this is a good way to learn more about Julia.
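# hypothetical fallback: dispatch on a symbol and call a hardwired
# version for each supported operation (combine_arrays_shared_plus etc.
# are made-up names for one-off copies of combine_arrays_shared)
function combine_arrays_shared_opt(array_1, array_2, op::Symbol)
    op == :plus  && return combine_arrays_shared_plus(array_1, array_2)
    op == :hypot && return combine_arrays_shared_hypot(array_1, array_2)
    error("unsupported op: $op")
end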
Harry
Upvotes: 2
Views: 96
Reputation: 12179
This question actually has nothing to do with SharedArrays, and is just "how do I pass functions-as-arguments and get better performance?"
The way FastAnonymous works (and the way closures will soon work in Julia) is to create a type with a call
method. If you're having trouble with FastAnonymous for some reason, you can always do it manually:
julia> immutable Foo end
julia> Base.call(f::Foo, x, y) = x*y
call (generic function with 1036 methods)
julia> function applyf(f, X)
           s = zero(eltype(X))
           for x in X
               s += f(x, x)
           end
           s
       end
applyf (generic function with 1 method)
julia> X = rand(10^6);
julia> f = Foo()
Foo()
# Run the function once with each type of argument to JIT-compile
julia> applyf(f, X)
333375.63216645207
julia> applyf(*, X)
333375.63216645207
# Compile anything used by @time
julia> @time 1
0.000004 seconds (148 allocations: 10.151 KB)
1
# Now let's benchmark
julia> @time applyf(f, X)
0.002860 seconds (5 allocations: 176 bytes)
333433.439233112
julia> @time applyf(*, X)
0.142411 seconds (4.00 M allocations: 61.035 MB, 19.24% gc time)
333433.439233112
Note the big increase in speed and the greatly reduced memory consumption.
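Applied to your code, the same trick would look something like this sketch (Add is a made-up functor type standing in for +; it's defined with @everywhere so the worker processes see it too):

# define the functor on all processes so remotecall_wait can call it
@everywhere immutable Add end
@everywhere Base.call(f::Add, x, y) = x + y

strain_div = combine_arrays_shared(eps_1, eps_2, Add())

Because Add is a concrete type, func(array_1[i], array_2[i]) in your chunk loop compiles to the same code as the hardwired addition. FastAnonymous's @anon macro essentially builds such a type for you from an anonymous-function expression.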
Upvotes: 1