Pedro G.

Reputation: 405

Julia: SharedArray with remote workers becomes a 0-element array

I'm trying to combine remote workers on a server with local workers, using Julia 1.5.3. The following code works fine when run locally with 24 workers:

using Distributed
using SharedArrays
a = SharedArray{Float64}(100)
@sync @distributed for i = 1:100
    a[i] = i+1
end
sum(a)

If I add workers with

N_remote = 24
for i=1:N_remote
    addprocs(["[email protected]"], tunnel=true, dir="/home/user/scripts/", exename="/home/user/julia-1.5.3/bin/julia")
end

Then I get the following error when running the first code:

 julia> include("test_sharedarray.jl")
ERROR: LoadError: TaskFailedException:
On worker 4:
BoundsError: attempt to access 0-element Array{Float64,1} at index [1]
setindex! at ./array.jl:847 [inlined]
setindex! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SharedArrays/src/SharedArrays.jl:510
macro expansion at /home/usuaris/spcom/gfebrer/bayesian_mc_watson/scripts/test_sharedarray.jl:5 [inlined]
#13 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:301
#160 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:87
#103 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356

...and 23 more exception(s).

Stacktrace:
 [1] sync_end(::Channel{Any}) at ./task.jl:314
 [2] (::Distributed.var"#159#161"{var"#13#14",UnitRange{Int64}})() at ./task.jl:333
Stacktrace:
 [1] sync_end(::Channel{Any}) at ./task.jl:314
 [2] top-level scope at task.jl:333
 [3] include(::String) at ./client.jl:457
 [4] top-level scope at REPL[5]:1
in expression starting at /home/user/scripts/test_sharedarray.jl:4

Upvotes: 2

Views: 181

Answers (1)

Przemyslaw Szufel

Reputation: 42194

A SharedArray works only within a single cluster node: it shares RAM between processes running on the same server. Workers on another server cannot map that memory, which is why they see a 0-element array.
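To see which workers a SharedArray is actually mapped on, you can compare `procs(a)` against `workers()`. The following is a minimal sketch that uses two local workers purely for illustration; the variable names `mapped`, `same_host` and `unmapped` are my own. With remote workers added, I would expect them to show up in `unmapped`.

```julia
using Distributed
addprocs(2)                        # two local workers, for illustration only
@everywhere using SharedArrays

a = SharedArray{Float64}(100)

# Worker processes the shared segment was mapped on
mapped = procs(a)

# All processes running on the same physical node as this one
same_host = procs(myid())

# Workers that are not part of the mapping; indexing `a` there fails
unmapped = setdiff(workers(), mapped)

println("mapped: ", mapped, ", same host: ", same_host, ", unmapped: ", unmapped)
```

If `unmapped` is non-empty, a plain `@distributed` loop will also schedule iterations on those workers, which matches the BoundsError in the question.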

Use DistributedArrays.jl instead:

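using Distributed, DistributedArrays
addprocs(2)
@everywhere using DistributedArrays
# Zero-filled 3×4 array, split into chunks across all workers
a = dzeros((3,4), workers())
# The range has one iteration per worker, so each worker fills only its own chunk
@sync @distributed for i = 1:nworkers()
    a_part = localpart(a)
    vec(a_part) .= (1:length(a_part)) .+ 1000*myid()
end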

And let us now see a:

julia> a
3×4 DArray{Float64,2,Array{Float64,2}}:
 2001.0  2004.0  3001.0  3004.0
 2002.0  2005.0  3002.0  3005.0
 2003.0  2006.0  3003.0  3006.0
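For reference, here is a rough sketch of how the loop from the question (`a[i] = i+1` over 100 elements) might be ported to a DArray. The helper `fill_plus_one!` is hypothetical, not part of DistributedArrays.jl, and the sketch assumes a package version that exports `localindices` (older releases called it `localindexes`).

```julia
using Distributed
addprocs(2)                        # two local workers, for illustration only
@everywhere using DistributedArrays

# Hypothetical helper: set d[i] = i + 1 by letting every worker fill its own chunk.
function fill_plus_one!(d::DArray)
    @sync for p in procs(d)
        @async remotecall_wait(p) do
            idx = localindices(d)[1]   # global indices owned by this worker
            localpart(d) .= idx .+ 1   # write only to the locally stored chunk
        end
    end
    return d
end

a = dzeros((100,), workers())
fill_plus_one!(a)
total = sum(a)                     # reduction over all chunks
```

Each worker writes only to its `localpart`, so no element is touched across the network during the fill; only the final `sum` gathers partial results.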

Upvotes: 6
