Jay Hacker

Reputation: 1915

Why would Scala Range iterator buffer -- sometimes?

In Scala 2.9.1, this works fine:

scala> (1 to Int.MaxValue).sum
res6: Int = -1073741824

Yet this runs out of heap space:

scala> (1 to Int.MaxValue).toIterator.sum
java.lang.OutOfMemoryError: GC overhead limit exceeded

But maddeningly, this works:

scala> (1 to Int.MaxValue).iterator.sum
res8: Int = -1073741824

Why should any of those be different?

Upvotes: 4

Views: 549

Answers (2)

Debilski

Reputation: 67828

toIterator is defined in TraversableLike as

def toIterator: Iterator[A] = toStream.iterator

so it creates a Stream in the background, which keeps all elements in memory while iterating.

(Edit: I think the stream structure isn't actually the problem here. Rather, toStream itself calls toBuffer, which in turn copies every single value.)

iterator, on the other hand, is defined in IndexedSeqLike, which uses a specialised structure that does not keep any elements in memory.
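A small sketch of the difference, assuming the copy made by toBuffer is what exhausts the heap. It does not call toIterator, since that method's behaviour may differ in other Scala versions; instead it performs the copying step explicitly. The bound n is an arbitrary value chosen small enough that both paths fit in memory, and the sums are widened to Long so they do not overflow.

```scala
// n is an illustrative bound, far smaller than Int.MaxValue
val n = 1000000

// Reads each value from the Range on demand; nothing is materialised
val lazySum = (1 to n).iterator.map(_.toLong).sum

// Copies every value into a buffer first, then iterates over the copy
val copiedSum = (1 to n).toBuffer.iterator.map(_.toLong).sum
```

Both sums are equal; only the second allocates storage proportional to n before iteration starts.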

Upvotes: 8

Austen Holmes

Reputation: 1929

If you take a closer look at the library source, the difference comes down to how each method is defined.

When you call toIterator, it first converts the sequence to a stream, and that conversion copies every element into an ArrayBuffer. This copy is likely what causes you to run out of memory.

When you use iterator, it creates an instance of the protected class Elements, which is a BufferedIterator. It keeps only an index and reads each element from the sequence itself, so nothing is copied.

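protected class Elements(...) ... {
    ...
    def next: A = {
        // Past the end: delegate to the empty iterator, whose next throws
        if (index >= end)
            Iterator.empty.next

        // Read the element straight from the enclosing sequence by index
        val x = self(index)
        index += 1

        x
    }
}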
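The same idea can be sketched as a standalone iterator. This is not the library's code: IndexIterator and its parameter names are hypothetical, and the range size is an arbitrary small value used for illustration.

```scala
// Hypothetical stand-in for the library's Elements class, not the actual source.
// It holds only the current index and reads each element from the sequence itself.
class IndexIterator[A](underlying: IndexedSeq[A], start: Int, end: Int) extends Iterator[A] {
  private var index = start

  def hasNext: Boolean = index < end

  def next(): A = {
    if (!hasNext) throw new NoSuchElementException("next on empty iterator")
    val x = underlying(index)
    index += 1
    x
  }
}

val range = 1 to 1000000
val it = new IndexIterator(range, 0, range.length)

// Widened to Long so the sum does not overflow
val total = it.map(_.toLong).sum
```

Because a Range computes each element from its start and step, this iterator needs constant extra memory no matter how long the range is.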

Upvotes: 2
