Reputation: 1267
I would like to parallelize the following function.
    fn cal_v(c: &Array3<f32>, dt: f32) -> Array3<f32> {
        let mut v: Array3<f32> = Array::zeros(c.raw_dim());
        for time in 1..c.shape()[0] {
            let before = c.slice(s![time-1, .., ..]).to_owned();
            let now = c.slice(s![time, .., ..]).to_owned();
            let now_v = (before - now) / dt;
            v.slice_mut(s![time, .., ..]).assign(&now_v);
        }
        v
    }
The argument c is a 3-dimensional array where the first dimension is the time index, the second dimension is the particle number, and the third dimension holds the x, y, z components.
Personally, I want to learn how to write multi-threaded code, so I would like to make the for time in 1..c.shape()[0] loop multithreaded.
I believe this loop can be parallelized or made asynchronous because no iteration of the for loop depends on any other iteration.
What is the best way to write this?
The version of the compiler I'm using is 1.51.0.
The version of the library I'm using is ndarray 0.14.0.
Upvotes: 0
Views: 106
Reputation: 42197
The simplest way to parallelize an iteration is to use rayon; that's basically its bread and butter. You will have to convert the loop to a more functional style (using for_each for the final loop body), but once that is done, par_iter() will "magically" distribute the work over its thread pool.
There is one issue I'm not too clear about, though, because you don't provide any of the surrounding code:
v.slice_mut(s![time, .., ..]).assign(&now_v);
is a hard sequential dependency between all the items as-is, since every iteration writes through the same mutable v, and I have no idea what the s macro does, so it may or may not be workable. If each time yields a single slot from v (or at least non-overlapping slots), you would be able to zip the iterators together so that each iteration has a "source" and a "destination" independent of all the others. Otherwise you'd have a sequential choke point in the output, which isn't something rayon supports well.
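To make the "source" and "destination" idea concrete, here is a minimal sketch using only the standard library rather than rayon or ndarray. Everything in it is an assumption for illustration: the name cal_v_parallel, the flat slice layout (frame after frame, each frame holding particles * 3 values) standing in for Array3, and the n_threads parameter are all hypothetical. It also relies on std::thread::scope, which needs Rust 1.63 or later, so it would not compile on 1.51.0 as-is.

```rust
use std::thread;

/// Hypothetical flat-slice stand-in for the question's cal_v.
/// `c` holds `n_time` frames of `frame_len` values each, stored frame after frame.
fn cal_v_parallel(
    c: &[f32],
    n_time: usize,
    frame_len: usize,
    dt: f32,
    n_threads: usize,
) -> Vec<f32> {
    assert_eq!(c.len(), n_time * frame_len);
    let mut v = vec![0.0f32; c.len()];
    if n_time < 2 || frame_len == 0 {
        return v;
    }
    let n_threads = n_threads.max(1);
    // Frame 0 stays zero, as in the original loop; only frames 1.. are written.
    let out = &mut v[frame_len..];
    // Number of frames handled by each thread, rounded up.
    let frames_per_thread = (n_time - 1 + n_threads - 1) / n_threads;

    thread::scope(|scope| {
        // chunks_mut hands out non-overlapping destination blocks,
        // so every thread owns its own part of the output.
        for (block, dst_block) in out.chunks_mut(frames_per_thread * frame_len).enumerate() {
            let first_time = 1 + block * frames_per_thread;
            scope.spawn(move || {
                for (i, dst) in dst_block.chunks_mut(frame_len).enumerate() {
                    let time = first_time + i;
                    // Read-only "source" frames; sharing them across threads is fine.
                    let before = &c[(time - 1) * frame_len..time * frame_len];
                    let now = &c[time * frame_len..(time + 1) * frame_len];
                    for ((d, b), n) in dst.iter_mut().zip(before).zip(now) {
                        *d = (b - n) / dt;
                    }
                }
            });
        }
    });
    v
}
```

The point of the design is that the output is split into disjoint mutable chunks before any thread starts, so no iteration ever needs mutable access to the whole of v. Calling cal_v_parallel(&c, n_time, particles * 3, dt, 4) would split frames 1.. across up to four threads. The same pairing of one destination chunk with its source frames is what a rayon version would need to express.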
Upvotes: 1