Ian Campbell

Reputation: 2748

Scala IDE and Apache Spark -- different scala library version found in the build path


I have this main object:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object Main {

  def main(args: Array[String]) {
        val sc = new SparkContext(
              new SparkConf().setMaster("local").setAppName("FakeProjectName")
        )
  }
}


...then I add spark-assembly-1.3.0-hadoop2.4.0.jar to the build path in Eclipse from

Project > Properties... > Java Build Path:

[screenshot: Java Build Path error]

...and this warning appears in the Eclipse console:

More than one scala library found in the build path
(C:/Program Files/Eclipse/Indigo 3.7.2/configuration/org.eclipse.osgi/bundles/246/1/.cp/lib/scala-library.jar,
C:/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar).
This is not an optimal configuration, try to limit to one Scala library in the build path.
FakeProjectName Unknown Scala Classpath Problem


Then I remove Scala Library [2.10.2] from the build path, and it still works. Except now this warning appears in the Eclipse console:

The version of scala library found in the build path is different from the one provided by scala IDE:
2.10.4. Expected: 2.10.2. Make sure you know what you are doing.
FakeProjectName Unknown Scala Classpath Problem


Is this a non-issue? Either way, how do I fix it?

Upvotes: 4

Views: 9095

Answers (3)

zarpetkov

Reputation: 9

There are two types of Spark JAR files (you can tell just by looking at the name):

- the name includes the word "assembly" and not "core" (Scala is bundled inside)

- the name includes the word "core" and not "assembly" (no Scala inside)

You should add the "core" type to your Build Path via "Add External JARs" (in the version you need), since the Scala IDE already provides a Scala library for you.

Alternatively, you can just take advantage of sbt and add the following dependency (again, pay attention to the versions you need):

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"
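If you prefer, the %% operator lets sbt append the Scala binary suffix to the artifact name for you; a small sketch of the equivalent form, assuming your project's scalaVersion is set to a 2.11.x release:

// equivalent form: %% appends the Scala binary version (here _2.11) automatically
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"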

Then you should NOT manually force any Spark JAR into the Build Path.

Happy sparking,

Zar


Upvotes: 0

Eloy Gonzales

Reputation: 41

The easiest solution:

In Eclipse:

1. Right-click the project and choose Properties.
2. Go to Scala Compiler.
3. Check Use Project Settings.
4. Set Scala Installation to a compatible version, generally Fixed Scala Installation: 2.XX.X (built-in).
5. Rebuild the project.

Upvotes: 4

Stefan Sigurdsson

Reputation: 221

This is often a non-issue, especially when the version difference is small, but there are no guarantees...

The problem is (as stated in the warning) that your project has two Scala libraries on the class path. One is explicitly configured as part of the project; this is version 2.10.2 and is shipped with the Scala IDE plugins. The other copy has version 2.10.4 and is included in the Spark jar.

One way to fix the problem is to install a different version of Scala IDE, one that ships with 2.10.4. But this is not ideal. As noted here, Scala IDE requires every project to use the same library version:

http://scala-ide.org/docs/current-user-doc/gettingstarted/index.html#choosing-what-version-to-install

A better solution is to clean up the class path by replacing the Spark jar you are using. The one you have is an assembly jar, which means it includes every dependency used in the build that produced it. If you are using sbt or Maven, then you can remove the assembly jar and simply add Spark 1.3.0 and Hadoop 2.4.0 as dependencies of your project. Every other dependency will be pulled in during your build. If you're not using sbt or Maven yet, then perhaps give sbt a spin - it is really easy to set up a build.sbt file with a couple of library dependencies, and sbt has a degree of support for specifying which library version to use.
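For example, here is a minimal build.sbt sketch along those lines, assuming the versions mentioned in the question (Spark 1.3.0 compiled for Scala 2.10, plus the Hadoop 2.4.0 client); adjust the names and versions to your own setup:

// build.sbt (sketch): let sbt pull in Spark and Hadoop instead of the assembly jar
name := "FakeProjectName"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version, so this resolves to spark-core_2.10
  "org.apache.spark"  %% "spark-core"    % "1.3.0",
  "org.apache.hadoop" %  "hadoop-client" % "2.4.0"
)

Once the dependencies come in through the build, the assembly jar can be removed from the Eclipse build path.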

Upvotes: 5
