Reputation: 1692
I am currently doing this:
val count = sightings.map(_.shape).distinct.length
However, map creates an intermediate collection, which in my case is a Vector thousands of times larger than what distinct produces.
How do I bypass this intermediate step and get the set of distinct shapes? Or, even better, the count of distinct shapes.
Upvotes: 1
Views: 543
Reputation: 12814
You can use an iterator to avoid creating the intermediate collection, and then accumulate the shapes in a Set to get the distinct ones:
val count = sightings.iterator.map(_.shape).toSet.size
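For a version that can be pasted and run, here is a minimal sketch of the iterator approach. The Shape hierarchy, the Sighting case class, and the sample data are hypothetical stand-ins, since the question does not show the real types.

```scala
object IteratorDistinct {
  // Hypothetical stand-ins for the question's types (not from the original post).
  sealed trait Shape
  case object Disc extends Shape
  case object Triangle extends Shape
  case object Cigar extends Shape

  final case class Sighting(id: Int, shape: Shape)

  val sightings: Vector[Sighting] = Vector(
    Sighting(1, Disc),
    Sighting(2, Triangle),
    Sighting(3, Disc),
    Sighting(4, Cigar),
    Sighting(5, Triangle)
  )

  // The iterator maps one element at a time, so no Vector of all shapes is built;
  // toSet keeps only the distinct shapes.
  val count: Int = sightings.iterator.map(_.shape).toSet.size
}
```

With this sample data, IteratorDistinct.count is 3, and the only collection allocated besides the input is the Set of distinct shapes.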
Alternatively, you can use collection.breakOut to accumulate the items in a Set without creating the intermediate collection (another answer suggested using breakOut, but in a different way):
val distinctShapes: Set[Shape] = sightings.map(_.shape)(collection.breakOut)
val count = distinctShapes.size
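Note that breakOut was removed in Scala 2.13. A rough equivalent on 2.13 or later, sketched here under the assumption that a lazy view is acceptable, is to map over a view and build the Set directly with to(Set). The Sighting class and sample data are hypothetical, with shapes modelled as plain strings for brevity.

```scala
object ViewDistinct {
  // Hypothetical stand-in for the question's data; shapes are plain strings here.
  final case class Sighting(id: Int, shape: String)

  val sightings: Vector[Sighting] = Vector(
    Sighting(1, "disc"),
    Sighting(2, "triangle"),
    Sighting(3, "disc")
  )

  // view makes map lazy; to(Set) then builds the Set straight from the mapped
  // elements, without materialising a Vector of all shapes first (Scala 2.13+).
  val distinctShapes: Set[String] = sightings.view.map(_.shape).to(Set)
  val count: Int = distinctShapes.size
}
```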
Upvotes: 4
Reputation: 1251
Apart from the other answers, there is a solution that fits your problem exactly: breakOut is the key you are looking for.
Example usage:
import scala.collection.breakOut
val count = sightings.map(_.shape)(breakOut).distinct.length
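Here, breakOut lets map build its result directly in the collection type the compiler expects. Be aware that this only avoids the large intermediate collection when that expected type is a Set (for example via a type annotation such as Set[Shape]); with no expected type, as in the line above, the result is most likely still a full-size sequence holding every shape before distinct runs.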
You can read the documentation for more information.
Upvotes: 2
Reputation: 51271
One approach is to remove the duplicates as you go, then count the results.
sightings.foldLeft(Set[Shape]()){case (ss,sight) => ss + sight.shape}.size
The intermediate Set of shapes is only ever as large as the number of distinct shapes encountered so far.
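The same fold can be wrapped in a small reusable helper. This is only a sketch: countDistinctBy, the Sighting class, and the sample data are hypothetical names introduced for illustration, with shapes modelled as strings.

```scala
object FoldDistinct {
  // Hypothetical helper: counts distinct keys in a single pass.
  // The accumulator Set never holds more than the distinct keys seen so far.
  def countDistinctBy[A, K](items: Iterable[A])(key: A => K): Int =
    items.foldLeft(Set.empty[K])((seen, item) => seen + key(item)).size

  // Hypothetical stand-in for the question's data.
  final case class Sighting(id: Int, shape: String)

  val sightings: Vector[Sighting] = Vector(
    Sighting(1, "disc"),
    Sighting(2, "triangle"),
    Sighting(3, "disc"),
    Sighting(4, "cigar")
  )

  val count: Int = countDistinctBy(sightings)(_.shape)
}
```

Putting the collection in its own parameter list lets the compiler infer the element type before it checks the key function, so the call site can use the short _.shape form.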
Upvotes: 3