Reputation: 3456
Consider a following list transformation:
List(1,2,3,4) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
It is evaluated in the following way:
List(1, 2, 3, 4) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
List(11, 12, 13, 14) filter (_ % 2 == 0) map (_ * 3)
List(12, 14) map (_ * 3)
List(36, 42)
So there are three passes and with each one a new list structure created.
So, the first question: can Stream
help to avoid it and if yes -- how? Can all evaluations be made in a single pass and without additional structures created?
Isn't the following Stream evaluation model correct:
Stream(1, ?) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
Stream(11, ?) filter (_ % 2 == 0) map (_ * 3)
// filter condition fail, evaluate the next element
Stream(2, ?) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
Stream(12, ?) filter (_ % 2 == 0) map (_ * 3)
Stream(12, ?) map (_ * 3)
Stream(36, ?)
// finish
If it is, then there are the same number of passes and the same number of new Stream
structures created as in the case of a List
. If it is not -- then the second question: what is Stream
evaluation model in particularly this type of transformation chain?
Upvotes: 1
Views: 660
Reputation: 1725
Scala can filter and transform a collection in a variety of ways.
First your example:
List(1,2,3,4) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
Could be optimized:
List(1,2,3,4) filter (_ % 2 == 0) map (v => (v+10)*3)
Or, folds could be used:
List(1,2,3,4).foldLeft(List[Int]()){ case (a,b) if b % 2 == 0 => a ++ List((b+10)*3) case (a,b) => a }
Or, perhaps a for-expression:
for( v <- List(1,2,3,4); w=v+10 if w % 2 == 0 ) yield w*3
Or, maybe the clearest to understand, a collection:
List(1,2,3,4).collect{ case v if v % 2 == 0 => (v+10)*3 }
But to address your questions about Streams; Yes, streams can be used and for large collections where what is wanted is often found early, a Stream is a good choice:
def myStream( s:Stream[Int] ): Stream[Int] =
((s.head+10)*3) #:: myStream(s.tail.filter( _ % 2 == 0 ))
myStream(Stream.from(2)).take(2).toList // An infinitely long list yields
// 36 & 42 where the 3rd element
// has not been processed yet
With this Stream example the filter is only applied to the next element as it is needed, not to the entire list -- good thing, or it would never stop :)
Upvotes: 0
Reputation: 51271
One way to avoid intermediate collections is to use view
.
List(1,2,3,4).view map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
It doesn't avoid every intermediate, but it can be useful. This page has lots of info and is well worth the time.
Upvotes: 3
Reputation: 18869
No, you can't avoid it by using Stream
.
But you do can avoid it by using the method collect
, and you should keep the idea that everytime you use a map
after filter
you may need a collect
.
Here is the code:
scala> def time(n: Int)(call : => Unit): Long = {
| val start = System.currentTimeMillis
| var cnt = n
| while(cnt > 0) {
| cnt -= 1
| call
| }
| System.currentTimeMillis - start
| }
time: (n: Int)(call: => Unit)Long
scala> val xs = List.fill(10000)((math.random * 100).toInt)
xs: List[Int] = List(37, 86, 74, 1, ...)
scala> val ys = Stream(xs :_*)
ys: scala.collection.immutable.Stream[Int] = Stream(37, ?)
scala> time(10000){ xs map (_+10) filter (_%2 == 0) map (_*3) }
res0: Long = 7182
//Note call force to evaluation of the whole stream.
scala> time(10000){ ys map (_+10) filter (_%2 == 0) map (_*3) force }
res1: Long = 17408
scala> time(10000){ xs.view map (_+10) filter (_%2 == 0) map (_*3) force }
res2: Long = 6322
scala> time(10000){ xs collect { case x if (x+10)%2 == 0 => (x+10)*3 } }
res3: Long = 2339
Upvotes: 2
Reputation: 7162
As far as I know, If you always iterate through the whole collection Stream does not help you.
It will create the same number as new Streams
as with the List
.
Correct me if I am wrong, but I understand it as follows:
Stream is a lazy structure, so when you do:
val result = Stream(1, ?) map (_ + 10) filter (_ % 2 == 0) map (_ * 3)
the result is another stream that links to the results of the previous transformations. So if force evaluation with a foreach (or e.g. mkString
)
result.foreach(println)
for each iteration the above chain is evaluated to get the current item.
However, you can reduce passes by 1, if you replace filter with withFilter
. Then the filter is kind of applied with the map function.
List(1,2,3,4) map (_ + 10) withFilter (_ % 2 == 0) map (_ * 3)
You can reduce it to one pass with flatMap:
List(1,2,3,4) flatMap { x =>
val y = x + 10
if (y % 2 == 0) Some(y * 3) else None
}
Upvotes: 0