monster
monster

Reputation: 1782

Apache Spark Task not Serializable when Class exends Serializable

I am consistently having errors regarding Task not Serializable.

I have made a small Class and it extends Serializable - which is what I believe is meant to be the case when you need values in it to be serialised.

class SGD(filePath : String) extends Serializable {

 val rdd = sc.textFile(filePath)
 val mappedRDD = rdd.map(x => x.split(" ")
                    .slice(0,3))
                    .map(y => Rating(y(0).toInt, y(1).toInt, y(2).toDouble))
                    .cache

 val RNG = new Random(1)
 val factorsRDD = mappedRDD(x => (x.user, (x.product, x.rating)))
                 .groupByKey
                 .mapValues(listOfItemsAndRatings => 
                                             Vector(Array.fill(2){RNG.nextDouble}))
}

The final line always results in a Task not Serializable error. What I do not understand is: the Class is Serializable; and, the Class Random is also Serializable according to the API. So, what am I doing wrong? I consistently can't get stuff like this to work; therefore, I imagine my understanding is wrong. I keep being told the Class must be Serializable... well it is and it still doesn't work!?

Upvotes: 2

Views: 2393

Answers (1)

Shyamendra Solanki
Shyamendra Solanki

Reputation: 8851

scala.util.Random was not Serializable until 2.11.0-M2.

Most likely you are using an earlier version of Scala.

A class doesn't become Serializable until all its members are Serializable as well (or some other mechanism is provided to serialize them, e.g. transient or readObject/writeObject.)

I get the following stacktrace when running given example in spark-1.3:

Caused by: java.io.NotSerializableException: scala.util.Random
Serialization stack:
    - object not serializable (class: scala.util.Random, value: scala.util.Random@52bbf03d)
    - field (class: $iwC$$iwC$SGD, name: RNG, type: class scala.util.Random)

One way to fix it is to take instatiation of random variable within mapValues:

mapValues(listOfItemsAndRatings => { val RNG = new Random(1)
                   Vector(Array.fill(2)(RNG.nextDouble)) })

Upvotes: 7

Related Questions