Reputation: 45
I'm new to parallel computing and managed to change my code so that it runs faster than my non-parallel code, but my results differ slightly. I tried the @sync, @async, @threads, and @distributed macros, but none of them keeps the outcome correct.
The goal is to find the shortest path through a maze. The non-parallel code works fine, but the parallel version does not seem to finish all iterations of the search algorithm and therefore only returns the shortest path found by one of the threads/workers.
It works by starting at the entrance, iterating over all points it can travel to towards the exit, selecting the point it reached the quickest, and repeating these steps until the exit is reached. It does this with two non-nested for loops, but when I speed up execution with macros like @async or @distributed it never finds the shortest path.
Is there a way to do this concurrently or in parallel while still getting the same result at the end?
Edit:
I added an example of a function that has the same problem. How would you be able to get the same z value in the end while speeding things up with parallelization?
a = rand(1, 2, 15)

function stufftodo(a)
    # y starts out large so the first slice always replaces it
    y = rand(size(a, 1), size(a, 2)) .* 1000
    for i in 1:size(a, 3)
        x = a[:, :, i]
        sleep(0.05)  # simulate per-slice work
        if sum(y) > sum(x)
            y = x
        end
    end
    z = minimum(y)
end
Upvotes: 0
Views: 555
Reputation: 42194
Here it is, any questions welcome:
Let us define the data for replicability of this example:
using Random
Random.seed!(0)
a = rand(1, 2, 15)
We start by measuring the timing of the original function. Since Julia compiles the code the first time it runs, we time it twice (only the second measurement is valid):
julia> @time stufftodo(a)
1.017913 seconds (654.90 k allocations: 33.067 MiB, 0.81% gc time)
0.042301665932029664
julia> @time stufftodo(a)
0.772471 seconds (538 allocations: 69.813 KiB)
0.042301665932029664
Now let us go distributed:
using Distributed
nworkers()<4 && addprocs(4)
@everywhere function stufftodo2(a)
    # associative reducer: of two slices, keep the one with the smaller sum
    aggfunc = (x, y) -> sum(y) > sum(x) ? x : y
    y = @distributed (aggfunc) for i in 1:size(a, 3)
        sleep(0.05)  # simulate per-slice work
        a[:, :, i]
    end
    minimum(y)
end
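If spinning up worker processes is too heavy for your workload, the same associative-reduction idea also works with tasks on threads (start Julia with e.g. `julia -t 4`). Here is a minimal sketch of that variant; the helper name `stufftodo_threaded` is mine, not part of the answer above:

```julia
function stufftodo_threaded(a)
    n = size(a, 3)
    nt = Threads.nthreads()
    # split 1:n into one disjoint chunk per task, round-robin style
    chunks = [i:nt:n for i in 1:min(nt, n)]
    tasks = map(chunks) do chunk
        Threads.@spawn begin
            # each task scans its own chunk serially, keeping the min-sum slice
            best = a[:, :, first(chunk)]
            for i in chunk
                sleep(0.05)  # simulate per-slice work
                x = a[:, :, i]
                if sum(x) < sum(best)
                    best = x
                end
            end
            best
        end
    end
    # final serial reduction over the few per-task winners
    candidates = fetch.(tasks)
    best = reduce((u, v) -> sum(v) > sum(u) ? u : v, candidates)
    return minimum(best)
end
```

Each task works on a disjoint chunk, so there is no shared mutable state during the loop; the only cross-task step is the final serial `reduce`, which is what keeps the result identical to the sequential version.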
Now we can measure the time of the distributed function (again only the second measurement is valid):
julia> @time stufftodo2(a)
2.819767 seconds (2.46 M allocations: 124.442 MiB, 1.53% gc time)
0.042301665932029664
julia> @time stufftodo2(a)
0.206596 seconds (447 allocations: 44.609 KiB)
0.042301665932029664
Upvotes: 1