Regalia9363
Regalia9363

Reputation: 342

Scala: Breaking out of foldLeft

Suppose we have Seq val ourSeq = Seq(10,5,3,5,4).

I want to return a new list which reads from the left and stop when it sees a duplicate number (e.g. Seq(10,5,3) since 5 is repeated).

I was thinking of using fold left as such

ourSeq.foldLeft(Seq())(op = (temp, curr) => {

  if (!temp.contains(curr)) {
    temp :+ curr 
  } else break

})

but as far as I understand, there is no way to break out of a foldLeft?

Upvotes: 2

Views: 2065

Answers (4)

Tim
Tim

Reputation: 27421

This can be done using a recursive function:

def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], res: List[T]): List[T] = 
    rem match {
      case hd::tail if !res.contains(hd) =>
        loop(tail, res :+ hd)
      case _ =>
        res
    }

  loop(ourSeq.toList, Nil)
}

This appears more complicated, but once you are familiar with the general pattern recursive functions are simple to write and more powerful than fold operations.

If you are working on large collections, this version is more efficient because it is O(n):

def distinctPrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], found: Set[T], res: List[T]): List[T] = 
    rem match {
      case hd::tail if !found.contains(hd) =>
        loop(tail, found + hd, hd +: res)
      case _ =>
        res.reverse
    }

  loop(ourSeq.toList, Set.empty, Nil)
}

This version works with any Seq and there are other options using Iterator etc. as described in the comments. You would need to be more specific about the type of the collection in order to create an optimised algorithm.

def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: Seq[T], res: List[T]): List[T] = 
    rem.take(1) match {
      case Seq(hd) if !res.contains(hd) =>
        loop(rem.drop(1), res :+ hd)
      case _ =>
        res
    }

  loop(ourSeq, Nil)
}

Upvotes: 3

Tomer Shetah
Tomer Shetah

Reputation: 8539

Another option you have, is to use the function inits:

ourSeq.inits.dropWhile(curr => curr.distinct.size != curr.size).next()

Code run at Scastie.

Upvotes: -1

jwvh
jwvh

Reputation: 51271

Although it can be accomplished with a foldLeft() without any breaking out, I would argue that fold is the wrong tool for the job.

I'm rather fond of unfold(), which was introduced in Scala 2.13.0.

val ourSeq = Seq(10,5,3,5,4)
Seq.unfold((Set.empty[Int],ourSeq)){ case (seen,ns) =>
  Option.when(ns.nonEmpty && !seen(ns.head)) {
    (ns.head, (seen+ns.head, ns.tail))
  }
}
//res0: Seq[Int] = Seq(10, 5, 3)

Upvotes: 4

Silvio Mayolo
Silvio Mayolo

Reputation: 70397

You are correct that it's not possible to break out of foldLeft. It would theoretically be possible to get the correct result with foldLeft, but you're still going to iterate the whole data structure. It'll be better to use an algorithm that already understands how to terminate early, and since you want to take a prefix, takeWhile will suffice.

import scala.collection.mutable.Set

val ourSeq = Seq(10, 5, 3, 5, 4)

val seen: Set[Int] = Set()
val untilDups = ourSeq.takeWhile((x) => {
  if (seen contains x) {
    false
  } else {
    seen += x
    true
  }
})
print(untilDups)

If you wanted to be totally immutable about this, you could wrap the whole thing in some kind of lazy fold that uses an immutable Set to keep its data. And that's certainly how I'd do it in Haskell. But this is Scala; we have mutability, and we may as well use it locally when it suits us.

Upvotes: 3

Related Questions