Reputation: 21
I'm trying to distribute a function that outputs a vector into an array.
I followed this post with something like the following code:
a = distribute([Float64[] for _ in 1:nrow(df)])
@sync @distributed for i in 1:nrow(df)
append!(localpart(a)[i], foo(df[i]))
end
But I get the following error:
BoundsError: attempt to access 145-element Vector{Vector{Float64}} at index [147]
I've only ever parallelized with SharedArrays, which aren't an option, since I need to store vectors in the shared array. Any and all advice would be life-saving.
Upvotes: 2
Views: 46
Reputation: 42244
Each localpart is indexed starting from one. Hence you need to convert between a global index and the local index.
The function DistributedArrays.localindices()
returns a one element tuple that contains a range of global indices that are mapped to the localpart.
This information can be in turn used for the index conversion:
@sync @distributed for i in 1:nrow(df)
id = i - DistributedArrays.localindices(a)[1][1] + 1
push!(localpart(a)[id], f(df[i,1]))
end
to understand how localindiced
work look at this code:
julia> addprocs(4);
julia> a = distribute([Float64[] for _ in 1:14]);
julia> fetch(@spawnat 2 DistributedArrays.localindices(a))
(1:4,)
julia> fetch(@spawnat 3 DistributedArrays.localindices(a))
(5:8,)
julia> fetch(@spawnat 4 DistributedArrays.localindices(a))
(9:11,)
julia> fetch(@spawnat 5 DistributedArrays.localindices(a))
(12:14,)
Upvotes: 1