Reputation: 3922
Lets assume we have a Scala list:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
We can easily remove duplicates using the following code:
l1.distinct
or
l1.toSet.toList
But what if we want to remove duplicates only if there are more than 2 of them? So if there are more than 2 elements with the same value we remain only two and remove the rest of them.
I could achieve it with following code:
l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten
that gave me the result:
List(2, 2, 5, 1, 1, 3, 3)
Elements are removed but the order of remaining elements is different from how these elements appeared in the initial list. How to do this operation and remain the order from original list?
So the result for l1 should be:
List(1, 2, 3, 1, 3, 2, 5)
Upvotes: 18
Views: 2670
Reputation: 1512
I prefer approach with immutable Map
:
def noMoreThan[T](list: List[T], max: Int): List[T] = {
def go(tail: List[T], freq: Map[T, Int]): List[T] = {
tail match {
case h :: t =>
if (freq(h) < max)
h :: go(t, freq + (h -> (freq(h) + 1)))
else go(t, freq)
case _ => Nil
}
}
go(list, Map[T, Int]().withDefaultValue(0))
}
Upvotes: 1
Reputation: 22374
Solution with groupBy
and filter, without any sorting (so it's O(N), sorting will give you additional O(Nlog(N)) in typical case):
val li = l1.zipWithIndex
val pred = li.groupBy(_._1).flatMap(_._2.lift(1)) //1 is your "2", but - 1
for ((x, i) <- li if !pred.get(x).exists(_ < i)) yield x
Upvotes: 1
Reputation: 213
Similar to how distinct is implemeted, with a multiset instead of a set:
def noMoreThan[T](list : List[T], max : Int) = {
val b = List.newBuilder[T]
val seen = collection.mutable.Map[T,Int]().withDefaultValue(0)
for (x <- list) {
if (seen(x) < max) {
b += x
seen(x) += 1
}
}
b.result()
}
Upvotes: 5
Reputation: 12123
distinct
is defined for SeqLike
as
/** Builds a new $coll from this $coll without any duplicate elements.
* $willNotTerminateInf
*
* @return A new $coll which contains the first occurrence of every element of this $coll.
*/
def distinct: Repr = {
val b = newBuilder
val seen = mutable.HashSet[A]()
for (x <- this) {
if (!seen(x)) {
b += x
seen += x
}
}
b.result()
}
We can define our function in very similar fashion:
def distinct2[A](ls: List[A]): List[A] = {
val b = List.newBuilder[A]
val seen1 = mutable.HashSet[A]()
val seen2 = mutable.HashSet[A]()
for (x <- ls) {
if (!seen2(x)) {
b += x
if (!seen1(x)) {
seen1 += x
} else {
seen2 += x
}
}
}
b.result()
}
scala> distinct2(l1)
res4: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
This version uses internal state, but is still pure. It is also quite easy to generalise for arbitrary n (currently 2), but specific version is more performant.
You can implement the same function with folds carrying the "what is seen once and twice" state with you. Yet the for loop and mutable state does the same job.
Upvotes: 4
Reputation: 8227
Here is canonical Scala code to do reduce three or more in a row to two in a row:
def checkForTwo(candidate: List[Int]): List[Int] = {
candidate match {
case x :: y :: z :: tail if x == y && y == z =>
checkForTwo(y :: z :: tail)
case x :: tail =>
x :: checkForTwo(tail)
case Nil =>
Nil
}
}
It looks at the first three elements of the list, and if they are the same, drops the first one and repeats the process. Otherwise, it passes items on through.
Upvotes: 1
Reputation: 11741
Not the most efficient.
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
Upvotes: 11
Reputation: 31724
Its a bit ugly, but its relatively faster
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1.foldLeft((Map[Int, Int](), List[Int]())) { case ((m, ls), x) => {
val z = m + ((x, m.getOrElse(x, 0) + 1))
(z, if (z(x) <= 2) x :: ls else ls)
}}._2.reverse
Gives: List(1, 2, 3, 1, 3, 2, 5)
Upvotes: 2
Reputation: 40500
How about this:
list
.zipWithIndex
.groupBy(_._1)
.toSeq
.flatMap { _._2.take(2) }
.sortBy(_._2)
.map(_._1)
Upvotes: 3
Reputation: 41759
More straightforward version using foldLeft:
l1.foldLeft(List[Int]()){(acc, el) =>
if (acc.count(_ == el) >= 2) acc else el::acc}.reverse
Upvotes: 6
Reputation: 992
My humble answer:
def distinctOrder[A](x:List[A]):List[A] = {
@scala.annotation.tailrec
def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
(list, covered) match {
case (Nil, _) => covered.reverse
case (lst, c) if c.count(_ == lst.head) >= 2 => distinctOrderRec(list.tail, covered)
case _ => distinctOrderRec(list.tail, list.head :: covered)
}
}
distinctOrderRec(x, Nil)
}
With the results:
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
On Edit: Right before I went to bed I came up with this!
l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)
With an answer of:
res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
Upvotes: 7
Reputation: 1786
Based on experquisite's answer, but using foldLeft
:
def noMoreThanBis(xs: List[Int], max: Int) = {
val initialState: (Map[Int, Int], List[Int]) = (Map().withDefaultValue(0), Nil)
val (_, result) = xs.foldLeft(initialState) { case ((count, res), x) =>
if (count(x) >= max)
(count, res)
else
(count.updated(x, count(x) + 1), x :: res)
}
result.reverse
}
Upvotes: 4
Reputation: 879
Not the prettiest. I look forward to seeing the other solutions.
def noMoreThan(xs: List[Int], max: Int) =
{
def op(m: Map[Int, Int], a: Int) = {
m updated (a, m(a) + 1)
}
xs.scanLeft( Map[Int,Int]().withDefaultValue(0) ) (op).tail
.zip(xs)
.filter{ case (m, a) => m(a) <= max }
.map(_._2)
}
scala> noMoreThan(l1, 2)
res0: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
Upvotes: 6
Reputation: 9100
Here is a recursive solution (it will stack overflow for large lists):
def filterAfter[T](l: List[T], max: Int): List[T] = {
require(max > 1)
//keep the state of seen values
val seen = Map[T, Int]().withDefaultValue(0)//init to 0
def filterAfter(l: List[T], seen: Map[T, Int]): (List[T], Map[T, Int]) = {
l match {
case x :: xs =>
if (seen(x) < max) {
//Update the state and pass to next
val pair = filterAfter(xs, seen updated (x, seen(x) + 1))
(x::pair._1, pair._2)
} else {
//already seen more than max
filterAfter(xs, seen)
}
case _ => (l, seen)//empty, terminate recursion
}
}
//call inner recursive function
filterAfter(l, seen, 2)._1
}
Upvotes: 1