Reputation: 33
I want perform the following, in pseudocode:
(a, b, c) = (HashSet(...), HashSet(...), HashSet(...))
(a, b, c) = (a - b - c, b - a - c, c - a - b)
In Rust I tried something like this:
fn get_random_set(...) -> HashSet<String> {
...
}
// Sets of randomly generated "words" that define behavior of the whole program.
let action_plus: HashSet<String> = get_random_set();
let action_minus: HashSet<String> = get_random_set();
let action_new_line: HashSet<String> = get_random_set();
Now we are to exclude all the common "words" from these HashSet
s.
I learned that difference
and union
methods return Difference
and Union
iterators. And if I do this:
let action_plus = HashSet::from(action_minus.union(&action_new_line).collect::<Vec<String>>());
I receive this:
the trait `From<Vec<String>>` is not implemented for `HashSet<_, _>`
How to deal with this problem?
Upvotes: 3
Views: 668
Reputation: 602485
Your goal is to remove any element that's in more than one set from all the sets. Since you are only removing elements, you don't need to clone any strings, and you can perform all operations in place. If your sets are called a
, b
and c
, you can use this code:
a.retain(|x| b.remove(x) | c.remove(x));
b.retain(|x| c.remove(x));
This will first remove any element in the intersections between a
and b
as well as a
and c
, respectively, from both sets, and then all elements in the intersection between b
and c
. Overall, this should be much faster than the approaches discussed in the other answers (though I didn't do any benchmarks).
Note that it's important that the first line uses the non-shortcircuiting |
operator rather than ||
. The last operand, c.remove(x)
, obviously has a side effect, so we can't skip it. Otherwise, elements that are in all three sets would not be removed from c
.
Upvotes: 2
Reputation: 6165
Collect directly into HashSet
instead of Vec
using impl FromIterator for HashSet.
That is:
let action_plus: HashSet<_> = action_minus.union(&action_new_line).collect();
As Chayim said in the comments you could also use |
(BitOr
) operator, since HashSet
implements it and as the docs say it:
Returns the union of self and rhs as a new HashSet<T, S>
If you look at it's implementation you can see, that it does basically the same as the code above (although it clones items from those sets):
impl<T, S> BitOr<&HashSet<T, S>> for &HashSet<T, S>
where
T: Eq + Hash + Clone,
S: BuildHasher + Default,
{
type Output = HashSet<T, S>;
fn bitor(self, rhs: &HashSet<T, S>) -> HashSet<T, S> {
self.union(rhs).cloned().collect()
}
}
HasSet
also implements BitAnd (&
) operator for creating an intersection, Sub (-
) for creating a set difference and BitXor (^
) for creating a symmetric difference.
You can use those convenience operators, however they can be off-putting when one does not expect them, so use them carefully.
Upvotes: 5
Reputation: 12812
You can collect directly into a HashSet
.
let a = get_random_set();
let b = get_random_set();
let c = get_random_set();
let a_minus_b_minus_c = a
.difference(&b)
.cloned()
.collect::<HashSet<_>>()
.difference(&c)
.cloned()
.collect::<HashSet<_>>();
cloned
is necessary to convert the output from HashSet<&String>
to HashSet<String>
, since difference
requires sets of the same type. You don't need the last cloned
if you're okay with a_minus_b_minus_c
being HashSet<&String>
.
However, it's easier to just use the subtraction operator. HashSet
has an implementation of Sub
that takes both operands by reference, so you need to add a reference each time.
let a_minus_b_minus_c = &(&a - &b) - &c;
There's also:
Note that all these clone the values in the set and return a new set, so they're not optimally efficient. If you need more performance, you could do the operation manually by modifying the first set with retain
(difference, intersection) or extend
(union), or by manually making a new set that holds references.
Upvotes: 3