Reputation: 405
I'm trying to run some code on Julia 1.5.3 that combines remote workers on a server with local workers. The following code works fine when run locally with 24 workers:
using Distributed
using SharedArrays
a = SharedArray{Float64}(100)
@sync @distributed for i = 1:100
    a[i] = i + 1
end
sum(a)
If I add workers with
N_remote = 24
for i = 1:N_remote
    addprocs(["[email protected]"], tunnel=true, dir="/home/user/scripts/", exename="/home/user/julia-1.5.3/bin/julia")
end
Then I get the following error when running the first snippet:
julia> include("test_sharedarray.jl")
ERROR: LoadError: TaskFailedException:
On worker 4:
BoundsError: attempt to access 0-element Array{Float64,1} at index [1]
setindex! at ./array.jl:847 [inlined]
setindex! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SharedArrays/src/SharedArrays.jl:510
macro expansion at /home/usuaris/spcom/gfebrer/bayesian_mc_watson/scripts/test_sharedarray.jl:5 [inlined]
#13 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:301
#160 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:87
#103 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356
...and 23 more exception(s).
Stacktrace:
[1] sync_end(::Channel{Any}) at ./task.jl:314
[2] (::Distributed.var"#159#161"{var"#13#14",UnitRange{Int64}})() at ./task.jl:333
Stacktrace:
[1] sync_end(::Channel{Any}) at ./task.jl:314
[2] top-level scope at task.jl:333
[3] include(::String) at ./client.jl:457
[4] top-level scope at REPL[5]:1
in expression starting at /home/user/scripts/test_sharedarray.jl:4
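Side note: I believe addprocs also accepts (host, count) tuples in the SSH machine spec, so the loop above could be written as a single call (same host and paths as above; untested sketch):
using Distributed
# one call that launches all 24 remote workers
addprocs([("[email protected]", 24)], tunnel=true, dir="/home/user/scripts/", exename="/home/user/julia-1.5.3/bin/julia")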
Upvotes: 2
Views: 181
Reputation: 42194
SharedArrays works only within a single cluster node: it shares RAM between processes running on the same server, so workers added on another server cannot see that memory. When a SharedArray is serialized to a worker on a different host, it arrives with an empty (0-element) backing array, which is exactly the BoundsError in your stack trace.
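You can see this with local workers only (a sketch; the addprocs(2) and the pids shown are illustrative):
using Distributed, SharedArrays
addprocs(2)                      # workers on the same host as the master
@everywhere using SharedArrays
a = SharedArray{Float64}(100)
procs(a)                         # pids that map the shared segment, e.g. [1, 2, 3]
# A same-host worker sees a non-empty index range; a worker on another host
# would see a 0-element local array, which is what the BoundsError reports.
remotecall_fetch(() -> localindices(a), 2)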
What you should do is to use DistributedArrays.jl instead:
using Distributed, DistributedArrays
addprocs(2)
@everywhere using DistributedArrays
a = dzeros((3, 4), workers())
@sync @distributed for i = 1:nworkers()
    # each worker writes into its own local chunk of the distributed array
    a_part = localpart(a)
    vec(a_part) .= (1:length(a_part)) .+ 1000 * myid()
end
And let us now see a:
julia> a
3×4 DArray{Float64,2,Array{Float64,2}}:
2001.0 2004.0 3001.0 3004.0
2002.0 2005.0 3002.0 3005.0
2003.0 2006.0 3003.0 3006.0
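For completeness, reductions such as the sum(a) from the question work directly on a DArray, and Array(a) gathers a local copy on the master process:
sum(a)       # distributed reduction over the local parts
Array(a)     # materialize the DArray as a regular Array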
Upvotes: 6