Reputation: 2436
I have the following class in Scala:
case class A(a: Int, b: Int) extends Serializable
When I try the following in Spark 2.4 (via Databricks):
val textFile = sc.textFile(...)
val df = textFile.map(_ => new A(2, 3)).toDF()
(Edit: the error happens when I call df.collect() or register it as a table.)
I get org.apache.spark.SparkException: Task not serializable
What am I missing?
I've tried adding encoders:
implicit def AEncoder: org.apache.spark.sql.Encoder[A] =
  org.apache.spark.sql.Encoders.kryo[A]
and
import spark.implicits._
import org.apache.spark.sql.Encoders
Edit: I have also tried:
val df = textFile.map(_ => new A(2, 3)).collect()
but no luck so far.
Upvotes: 0
Views: 135
Reputation: 18023
Sometimes this error occurs intermittently on Databricks, which is most annoying.
Restart the cluster and try again; I have hit this error occasionally, and after a restart it did not recur.
Upvotes: 1
Reputation: 3173
You can parse the file directly as a Dataset using the case class you already have:
case class A(a: Int, b: Int) extends Serializable

val testRDD = spark.sparkContext.textFile("file:///test_file.csv")
val testDS = testRDD
  .map(line => line.split(","))
  .map(line_cols => A(line_cols(0).toInt, line_cols(1).toInt))
  .toDS()
// res23: org.apache.spark.sql.Dataset[A] = [a: int, b: int]
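As a quick check, here is a minimal usage sketch (it assumes test_file.csv contains comma-separated integer pairs such as 2,3; the val name rows is just for illustration):

// Bring the rows back to the driver as a typed Array[A]
val rows: Array[A] = testDS.collect()

// Or print the first rows; the column names come from the case class fields
testDS.show()

Since toDS() picks up the implicit product encoder for the case class from spark.implicits._, no custom Kryo encoder is needed here.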
Upvotes: 0