Banehallow

Reputation: 141

IntelliJ IDEA + sbt: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf

I'm new to Spark. I set up an environment with Linux, IntelliJ IDEA, and sbt. When I try the Spark quick start, I get the following error:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
    at test$.main(test.scala:11)
    at test.main(test.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
    Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkConf
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more

The versions installed on my machine:

sbt   = 0.13.11
jdk   = 1.8
scala = 2.10
idea  = 2016

My directory structure:

test/
  idea/
  out/
  project/
    build.properties    
    plugins.sbt
  src/
    main/
      java/
      resources/
      scala/
      scala-2.10/
        test.scala
  target/
  assembly.sbt
  build.sbt

In build.properties:

sbt.version = 0.13.8

In plugins.sbt:

logLevel := Level.Warn

addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")

In build.sbt:

import sbt._
import Keys._
import sbtassembly.Plugin._
import AssemblyKeys._

name := "test"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.1" % "provided"

In assembly.sbt:

import AssemblyKeys._ // put this at the top of the file

assemblySettings

In test.scala:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object test {
  def main(args: Array[String]) {
    val logFile = "/opt/spark-1.6.1-bin-hadoop2.6/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Test Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}

How can I solve this problem?

Upvotes: 14

Views: 26615

Answers (3)

Jared

Reputation: 2954

In IntelliJ IDEA 2018.1, the run configuration has a checkbox labeled "Include dependencies with "Provided" scope". Checking this option solved it for me.

Upvotes: 19

user3485352

Reputation: 119

I had the same issue this morning, with the same error. I removed "provided" and ran sbt clean, reload, compile, package, and run. I also tested it with spark-submit from the command line. That said, I think keeping "provided" means less overhead, because the resulting jar is smaller.
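
For reference, the change described above would look roughly like this in the asker's build.sbt. This is a sketch based on the dependency line quoted in the question. Without "provided", Spark ends up on the runtime classpath and is also bundled into the assembly jar, which is why the jar gets larger:

```scala
// build.sbt (fragment)
// Same coordinates as in the question, with the "provided" scope removed,
// so spark-core is on the classpath when the program is run directly.
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.1"
```

If the jar is later launched with spark-submit, the Spark installation already supplies these classes, so restoring "provided" before packaging keeps the artifact small.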

Upvotes: 6

Sergey

Reputation: 2900

Dependencies with "provided" scope are available only during compilation and testing; they are not available at runtime or for packaging. So, instead of making an object test with a main method, you should make it an actual test suite placed in src/test/scala. If you're not familiar with unit testing in Scala, I'd suggest using ScalaTest, for example. First add a dependency on it in your build.sbt: libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.4" % Test and then follow this quick start tutorial to implement a simple spec.
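
As a rough illustration of the test-suite approach, here is a minimal ScalaTest spec, assuming it is saved under src/test/scala. The class name LineCountSpec and the sample strings are made up for this sketch, and setMaster("local[2]") is an assumption added so the context can start inside the test without spark-submit:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FlatSpec, Matchers}

// Hypothetical spec name; place the file under src/test/scala.
class LineCountSpec extends FlatSpec with Matchers with BeforeAndAfterAll {

  private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    // Assumption: run Spark in-process with two local threads.
    val conf = new SparkConf()
      .setAppName("Test Application")
      .setMaster("local[2]")
    sc = new SparkContext(conf)
  }

  override def afterAll(): Unit = {
    // Stop the context so later suites can create their own.
    if (sc != null) sc.stop()
  }

  "filter" should "count lines containing a given letter" in {
    // In-memory sample data instead of the README file from the question.
    val lines = sc.parallelize(Seq("alpha", "beta", "zzz"))
    lines.filter(_.contains("a")).count() shouldBe 2L
    lines.filter(_.contains("b")).count() shouldBe 1L
  }
}
```

Running sbt test should then pick up the spec, and because the test classpath includes "provided" dependencies, SparkConf resolves there.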


Another option, which is quite hacky in my opinion (but does the trick nonetheless), is to remove the provided scope from your spark-core dependency in some configurations. It is described in the accepted answer to this question.
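
One commonly cited variant of this idea is to keep "provided" on the dependency but redefine the run task so that it uses the compile classpath, which does include provided dependencies. The fragment below is a sketch for sbt 0.13.x; whether it matches the linked answer cannot be confirmed from the text here, and it affects sbt run rather than a run configuration defined inside the IDE:

```scala
// build.sbt (fragment)
// Make `sbt run` use the Compile classpath, which contains "provided"
// dependencies, instead of the default Runtime classpath, which does not.
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```

The packaged jar is unaffected by this setting, so spark-core still stays out of the assembly.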

Upvotes: 19
