Reputation: 342
Suppose we have Seq val ourSeq = Seq(10,5,3,5,4)
.
I want to return a new list which reads from the left and stop when it sees a duplicate number (e.g. Seq(10,5,3)
since 5 is repeated).
I was thinking of using fold left as such
ourSeq.foldLeft(Seq())(op = (temp, curr) => {
if (!temp.contains(curr)) {
temp :+ curr
} else break
})
but as far as I understand, there is no way to break out of a foldLeft
?
Upvotes: 2
Views: 2065
Reputation: 27421
This can be done using a recursive function:
def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
@annotation.tailrec
def loop(rem: List[T], res: List[T]): List[T] =
rem match {
case hd::tail if !res.contains(hd) =>
loop(tail, res :+ hd)
case _ =>
res
}
loop(ourSeq.toList, Nil)
}
This appears more complicated, but once you are familiar with the general pattern recursive functions are simple to write and more powerful than fold
operations.
If you are working on large collections, this version is more efficient because it is O(n)
:
def distinctPrefix[T](ourSeq: Seq[T]): List[T] = {
@annotation.tailrec
def loop(rem: List[T], found: Set[T], res: List[T]): List[T] =
rem match {
case hd::tail if !found.contains(hd) =>
loop(tail, found + hd, hd +: res)
case _ =>
res.reverse
}
loop(ourSeq.toList, Set.empty, Nil)
}
This version works with any Seq
and there are other options using Iterator
etc. as described in the comments. You would need to be more specific about the type of the collection in order to create an optimised algorithm.
def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
@annotation.tailrec
def loop(rem: Seq[T], res: List[T]): List[T] =
rem.take(1) match {
case Seq(hd) if !res.contains(hd) =>
loop(rem.drop(1), res :+ hd)
case _ =>
res
}
loop(ourSeq, Nil)
}
Upvotes: 3
Reputation: 8539
Another option you have, is to use the function inits
:
ourSeq.inits.dropWhile(curr => curr.distinct.size != curr.size).next()
Code run at Scastie.
Upvotes: -1
Reputation: 51271
Although it can be accomplished with a foldLeft()
without any breaking out, I would argue that fold
is the wrong tool for the job.
I'm rather fond of unfold()
, which was introduced in Scala 2.13.0.
val ourSeq = Seq(10,5,3,5,4)
Seq.unfold((Set.empty[Int],ourSeq)){ case (seen,ns) =>
Option.when(ns.nonEmpty && !seen(ns.head)) {
(ns.head, (seen+ns.head, ns.tail))
}
}
//res0: Seq[Int] = Seq(10, 5, 3)
Upvotes: 4
Reputation: 70397
You are correct that it's not possible to break out of foldLeft
. It would theoretically be possible to get the correct result with foldLeft
, but you're still going to iterate the whole data structure. It'll be better to use an algorithm that already understands how to terminate early, and since you want to take a prefix, takeWhile
will suffice.
import scala.collection.mutable.Set
val ourSeq = Seq(10, 5, 3, 5, 4)
val seen: Set[Int] = Set()
val untilDups = ourSeq.takeWhile((x) => {
if (seen contains x) {
false
} else {
seen += x
true
}
})
print(untilDups)
If you wanted to be totally immutable about this, you could wrap the whole thing in some kind of lazy fold that uses an immutable Set
to keep its data. And that's certainly how I'd do it in Haskell. But this is Scala; we have mutability, and we may as well use it locally when it suits us.
Upvotes: 3