Reputation: 4927
I found myself lately using sliding(n,n) when I need to iterate collections in groups of n elements without re-processing any of them. I was wondering if it would be more correct to iterate those collections by using grouped(n). My question is if there is an special reason to use one or another for this specific case in terms of performance.
val listToGroup = List(1,2,3,4,5,6,7,8)
listToGroup: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)
listToGroup.sliding(3,3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8))
listToGroup.grouped(3).toList
res1: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8))
Upvotes: 33
Views: 32493
Reputation: 2506
import scala.util.Random
val measurements: Vector[Long] = {
val random = new Random(2021)
(1 to 100_000_000).map(_ => 10 + random.nextLong(90)).toVector
}
def time(block: => Unit): Long = {
val start = System.nanoTime()
block
val end = System.nanoTime()
(end - start) / 1000 / 1000
}
scala> time {measurements.grouped(2).collect { case Seq(a, b) if a < b => b }.size}
val res38: Long = 2367
scala> time {measurements.sliding(2,1).collect { case Seq(a, b) if a < b => b }.size}
val res39: Long = 5174
sliding
is more than 2x slower.
Scala3, macbook, Apple M1 Pro, 16 GB
Upvotes: 0
Reputation: 1936
The reason to use sliding
instead of grouped
is really only applicable when you want to have the 'windows' be of a length different than what you 'slide' by (that is to say, using sliding(m, n)
where m != n
):
listToGroup.sliding(2,3).toList
//returns List(List(1, 2), List(4, 5), List(7, 8))
listToGroup.sliding(4,3).toList
//returns List(List(1, 2, 3, 4), List(4, 5, 6, 7), List(7, 8))
As som-snytt points out in a comment, there's not going to be any performance difference, as both of them are implemented within Iterator
as returning a new GroupedIterator
. However, it's simpler to write grouped(n)
than sliding(n, n)
, and your code will be cleaner and more obvious in its intended behavior, so I would recommend grouped(n)
.
As an example for where to use sliding
, consider this problem where grouped
simply doesn't suffice:
Given a list of numbers, find the sublist of length 4 with the greatest sum.
Now, putting aside the fact that a dynamic programming approach can produce a more efficient result, this can be solved as:
def maxLengthFourSublist(list: List[Int]): List[Int] = {
list.sliding(4,1).maxBy(_.sum)
}
If you were to use grouped
here, you wouldn't get all the sublists, so sliding
is more appropriate.
Upvotes: 36