Reputation: 363
I am trying to create a HashMap
using functional programming and utilizing parallelization by rayon
.
If I try this without rayon
, it works:
use std::collections::HashMap;
fn main() {
let nums = [1, 2, 1, 2, 1, 2];
let result: HashMap<i32, i32> =
nums.iter()
.filter(|x| *x % 2 == 0)
.fold(HashMap::new(), |mut acc, x| {
*acc.entry(*x).or_insert(0) += 1;
acc
});
println!("{:?}", result);
}
If I try to use multiple cores by switching from iter()
to par_iter()
, I get an error:
use rayon::prelude::*; // 1.5.1
use std::collections::HashMap;
fn main() {
let nums = [1, 2, 1, 2, 1, 2];
let result: HashMap<i32, i32> =
nums.par_iter()
.filter(|x| *x % 2 == 0)
.fold(HashMap::new(), |mut acc, x| {
*acc.entry(*x).or_insert(0) += 1;
acc
});
println!("{:?}", result);
}
error[E0277]: expected a `Fn<()>` closure, found `HashMap<_, _>`
--> src/main.rs:9:19
|
9 | .fold(HashMap::new(), |mut acc, x| {
| ^^^^^^^^^^^^^^ expected an `Fn<()>` closure, found `HashMap<_, _>`
|
= help: the trait `Fn<()>` is not implemented for `HashMap<_, _>`
= note: wrap the `HashMap<_, _>` in a closure with no arguments: `|| { /* code */ }`
error[E0308]: mismatched types
--> src/main.rs:7:9
|
6 | let result: HashMap<i32, i32> =
| ----------------- expected due to this
7 | / nums.par_iter()
8 | | .filter(|x| *x % 2 == 0)
9 | | .fold(HashMap::new(), |mut acc, x| {
10 | | *acc.entry(*x).or_insert(0) += 1;
11 | | acc
12 | | });
| |______________^ expected struct `HashMap`, found struct `Fold`
|
= note: expected struct `HashMap<i32, i32>`
found struct `Fold<rayon::iter::Filter<rayon::slice::Iter<'_, {integer}>, [closure@src/main.rs:8:21: 8:36]>, HashMap<_, _>, _>`
Obviously, Rust tries to stop me from doing something stupid involving race conditions, but how would I build a HashMap
inside a par_iter()
?
Upvotes: 2
Views: 1089
Reputation: 42678
Rayon's fold creates intermediate items (cannot be known how many). From the documentation (emphasis mine):
Parallel fold is similar to sequential fold except that the sequence of items may be subdivided before it is folded. Consider a list of numbers like
22 3 77 89 46
. If you used sequential fold to add them (fold(0, |a,b| a+b)
, you would wind up first adding 0 + 22, then 22 + 3, then 25 + 77, and so forth. The parallel fold works similarly except that it first breaks up your list into sublists, and hence instead of yielding up a single sum at the end, it yields up multiple sums. The number of results is nondeterministic, as is the point where the breaks occur.
You need to reduce those intermediate items into the final one:
use rayon::prelude::*; // 1.5.1
use std::collections::HashMap;
fn main() {
let nums = [1, 2, 1, 2, 1, 2];
let result: HashMap<i32, i32> = nums
.par_iter()
.filter(|x| *x % 2 == 0)
.fold(HashMap::new, |mut acc, x| {
*acc.entry(*x).or_insert(0) += 1;
acc
})
.reduce_with(|mut m1, m2| {
for (k, v) in m2 {
*m1.entry(k).or_default() += v;
}
m1
})
.unwrap();
println!("{:?}", result);
}
Also note that the first parameter of Rayon's fold
is a function that creates the empty HashMap
, not an empty HashMap
like the standard library's fold
.
Upvotes: 2