Reputation: 7491
I am new to Scala and would like to understand why the following code results
in a "GC overhead limit exceeded" error
and what should be done to avoid it.
import scala.io.Source
import scala.annotation.tailrec

def getItems(file: Source): Stream[String] = {
  @tailrec
  def acc(it: Iterator[String],
          item: String,
          items: Stream[String]): Stream[String] = {
    if (it.hasNext) {
      val line = it.next
      line take 1 match {
        case " " =>
          acc(it, item + "\n" + line, items)
        case "1" =>
          acc(it, item, Stream.cons(item, items))
      }
    }
    else {
      Stream.cons(item, items)
    }
  }
  acc(file.getLines(), "", Stream.Empty)
}
Upvotes: 0
Views: 249
Reputation: 4017
Stream in Scala is actually a leaky abstraction. It pretends to be a Seq,
but you can't use it as a regular collection if the stream is huge.
Here is an article about streams: http://blog.dmitryleskov.com/programming/scala/stream-hygiene-i-avoiding-memory-leaks/
In your case, the rule 'don't store Streams in method arguments' is violated (items).
Upvotes: 0
Reputation: 14217
There are two reasons why your code may cause an OOM:
1. item grows with every recursive call as the file is read, so it may become very large depending on your file size.
2. The accumulated item is repeatedly added to the Stream, which may also become very large.
Using lazy evaluation and a Stream without memoization may be a way to avoid this.
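As a rough sketch of the lazy approach, the grouping can be done with a plain Iterator, which produces one item at a time and does not memoize earlier ones. This assumes that a line starting with a space continues the current item and that any other line starts a new one, which is only a guess at the intended format. The name items is hypothetical.

```scala
// Hypothetical helper: groups lines into items lazily, one item per next() call.
// Assumption: a line starting with a space continues the current item;
// any other line starts a new item.
def items(lines: Iterator[String]): Iterator[String] = new Iterator[String] {
  // buffered lets us peek at the next line without consuming it
  private val in = lines.buffered

  def hasNext: Boolean = in.hasNext

  def next(): String = {
    // StringBuilder avoids copying the whole item on every appended line
    val sb = new StringBuilder(in.next())
    while (in.hasNext && in.head.startsWith(" ")) {
      sb.append('\n').append(in.next())
    }
    sb.toString
  }
}
```

With a Source, it could be used as items(file.getLines()).foreach(println). Only the item currently being built is held in memory, provided the caller does not collect the results.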
Upvotes: 1
Reputation: 111
I am trying to figure out what you are actually trying to do, but the problem is that you keep recursing with your acc function until the input file has no more elements. Here is a very simple example that converts your iterator into a stream.
def convert[T](iter: Iterator[T]): Stream[T] =
  if (iter.hasNext) {
    Stream.cons(iter.next, convert(iter))
  } else {
    Stream.empty
  }
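To illustrate why this version does not walk the whole input up front, here is a small check. It relies on Stream.cons taking its tail by name, so the recursive call is only evaluated when the tail is requested. The helper pulledAfterBuild is hypothetical and only counts how many elements were read from the source right after the stream was built.

```scala
def convert[T](iter: Iterator[T]): Stream[T] =
  if (iter.hasNext) Stream.cons(iter.next(), convert(iter))
  else Stream.empty

// Hypothetical helper: builds a stream over an endless iterator and reports
// how many elements had been pulled from it once convert returned.
def pulledAfterBuild(): Int = {
  var pulled = 0
  val source = Iterator.from(1).map { n => pulled += 1; n }
  val s = convert(source) // only the head is evaluated here
  pulled
}
```

Note that a Stream still memoizes every element it has produced, so keeping a reference to its head while traversing a huge input can exhaust memory as well.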
In addition, you are appending all lines that start with a space to item. I don't know how many such lines you have in your input, but if every line started with a space, you would use about (n^2)/2
characters for an input file of n characters. But I don't think that's why your recursion fails.
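The quadratic cost comes from item + "\n" + line building a new String, and copying everything accumulated so far, on every step. A minimal sketch of the difference, using two hypothetical helpers that produce the same result:

```scala
// Hypothetical helper: repeated concatenation, as in the question.
// Each step copies the whole accumulated string, so the total work is quadratic.
def joinWithConcat(lines: Seq[String]): String =
  lines.foldLeft("")((item, line) => item + "\n" + line)

// Hypothetical helper: same result, but appends into one growing buffer.
def joinWithBuilder(lines: Seq[String]): String = {
  val sb = new StringBuilder
  lines.foreach(line => sb.append('\n').append(line))
  sb.toString
}
```

Both versions still hold the full result in memory, so this only reduces the amount of garbage created along the way.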
Upvotes: 0