Reputation: 307
I use Spark 2.1.1 with Scala 2.11.8.
Inside spark-shell, I use the :load command to load a class that has methods working with RDDs.
When I try to load the class I get the following compilation error:
error: not found: type RDD
Why? I've got the import statement.
This is the code I'm working with.
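Presumably a minimal hello.scala along these lines (it matches the file the answers below work with):
import org.apache.spark.rdd.RDD

class Hello {
  def get(rdd: RDD[String]): RDD[String] = rdd
}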
Upvotes: 3
Views: 3132
Reputation: 621
This is a bug in spark-shell; see https://issues.apache.org/jira/browse/SPARK-22393. It was fixed in Spark 2.3.0. Please use Spark 2.3.0 (or later), or use the workaround suggested by @Jacek Laskowski.
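If you're not sure which version your shell is running, you can check from within spark-shell itself; spark is the SparkSession the shell creates for you (anything below 2.3.0 is affected):
scala> spark.version
res0: String = 2.1.1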
Upvotes: 1
Reputation: 74619
That seems to be a "feature" of :load in spark-shell. A solution is to move import org.apache.spark.rdd.RDD (the exact class, no trailing dot and underscore) inside your class definition.
This is not specific to the RDD class; it happens with any imported class. The import won't be visible unless the import statement is defined inside the class itself. With that said, the following won't work because the import is outside the class:
// hello.scala: the import sits at the top level, outside the class
import org.apache.spark.rdd.RDD
class Hello {
def get(rdd: RDD[String]): RDD[String] = rdd
}
scala> :load hello.scala
Loading hello.scala...
import org.apache.spark.rdd.RDD
<console>:12: error: not found: type RDD
def get(rdd: RDD[String]): RDD[String] = rdd
^
<console>:12: error: not found: type RDD
def get(rdd: RDD[String]): RDD[String] = rdd
^
You can see what happens under the covers using the -v flag of :load:
scala> :load -v hello.scala
Loading hello.scala...
scala>
scala> import org.apache.spark.rdd.RDD
import org.apache.spark.rdd.RDD
scala> class Hello {
| def get(rdd: RDD[String]): RDD[String] = rdd
| }
<console>:12: error: not found: type RDD
def get(rdd: RDD[String]): RDD[String] = rdd
^
<console>:12: error: not found: type RDD
def get(rdd: RDD[String]): RDD[String] = rdd
^
That led me to guess that having the import inside the class definition could help. And it did! (to my great surprise)
// hello.scala: the import now lives inside the class
class Hello {
import org.apache.spark.rdd.RDD
def get(rdd: RDD[String]): RDD[String] = rdd
}
scala> :load -v hello.scala
Loading hello.scala...
scala> class Hello {
| import org.apache.spark.rdd.RDD
| def get(rdd: RDD[String]): RDD[String] = rdd
| }
defined class Hello
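As a quick sanity check that the class really works with RDDs now, you can exercise it straight away (sc is the SparkContext that spark-shell provides):
scala> new Hello().get(sc.parallelize(Seq("hello", "world"))).collect()
res1: Array[String] = Array(hello, world)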
You could also use the :paste command to paste the class into spark-shell. There's also the so-called raw mode, in which you can define classes in their own package. Note that :load trips over a package declaration (packages aren't allowed in the regular REPL), whereas :paste -raw handles it fine:
// hello.scala: same class, now in its own package
package mypackage
class Hello {
import org.apache.spark.rdd.RDD
def get(rdd: RDD[String]): RDD[String] = rdd
}
scala> :load -v hello.scala
Loading hello.scala...
scala> package mypackage
<console>:1: error: illegal start of definition
package mypackage
^
scala>
scala> class Hello {
| import org.apache.spark.rdd.RDD
| def get(rdd: RDD[String]): RDD[String] = rdd
| }
defined class Hello
scala> :paste -raw
// Entering paste mode (ctrl-D to finish)
package mypackage
class Hello {
import org.apache.spark.rdd.RDD
def get(rdd: RDD[String]): RDD[String] = rdd
}
// Exiting paste mode, now interpreting.
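After the raw paste, the class lives in its package and is available under its fully qualified name (again just a quick check, assuming the usual sc):
scala> new mypackage.Hello().get(sc.parallelize(Seq("a", "b"))).count()
res2: Long = 2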
Upvotes: 7