Reputation: 413
I am required to pass a Set in Scala to a series of functions. I don't have a choice, it must be a Set.
I receive a really big List of JSON values:
scala.List[JsonAST.JValue]
The naive way would be to use toSet:
val input : scala.List[JsonAST.JValue]
val mySet = input.toSet
But this can be very expensive from a performance perspective. The way toSet works is it starts with the first element in the list, and then adds the following ones one-by-one. Each time, it scans the Set to ensure the value isn't already there.
Let's assume I know for a fact that there aren't any duplicates. Is there a way for me to create a Set more efficiently? Ideally, just copying the List into the Set. No changes in order or contents.
Upvotes: 0
Views: 223
Reputation: 32719
As ziggystar pointed out, you can only have O(1) performance when converting a List
to a Set
if you are willing to use an adapter, and then trade fast conversion (O(1)) to a set with poor performance when calling contains
(O(n)).
If that is an acceptable trade off for your use case, you can use something like this:
import scala.collection.AbstractSet
class SetAdapter[A](val self: Seq[A]) extends AbstractSet[A] {
def +(elem: A): SetAdapter[A] = if (self.contains(elem)) this else new SetAdapter(self :+ elem)
def -(elem: A): SetAdapter[A] = new SetAdapter(self.filterNot(_ == elem))
def contains(elem: A): Boolean = self.contains(elem)
// assumes that are not duplicates in self, otherwise it will
// incorrectly iterate several times over equal values, which a set is not supposed to do
def iterator: Iterator[A] = self.iterator
}
implicit class AsSetOps[A](val seq: Seq[A]) extends AnyVal {
def asSet() = new SetAdapter[A](seq)
}
Quick REPL test to check that it works as expected:
scala> val s = l.asSet
s: SetAdapter[Int] = Set(1, 3, 4, 8, 12)
scala> s + 6
res8: SetAdapter[Int] = Set(1, 3, 4, 8, 12, 6)
scala> s - 8
res9: SetAdapter[Int] = Set(1, 3, 4, 12)
In your example, all you'd have to do then is this:
val input : scala.List[JsonAST.JValue]
val mySet = input.asSet
Upvotes: 1
Reputation: 28690
mutable.HashSet
If the set is represented as a tree, which it most probably is, then the complexity of building the set is only O(n·log n)
. Otherwise use a mutable.HashSet
for building, which results in amortized O(n)
.
You can achieve this with list.to[mutable.HashSet]
.
O(1)
conversion (constant time)You cannot build a Set
from a List
in O(1)
, as you should have to visit each element at least once. So O(n)
is the fastest in a reasonable sense. You could have some Set
-wrapper around the List
. But then access is not fast anymore, as it delegates to List.elementAt()
.
Upvotes: 2