touchaponk

Reputation: 404

Spark Streaming StreamingContext error

I'm a Java veteran trying to learn Scala and Spark Streaming. I downloaded the Eclipse-based Scala IDE plus the Spark core and Spark Streaming jars (both for Scala 2.10) and tried out the example below. I'm getting this error:

val ssc = new StreamingContext(conf, Seconds(1))

Description Resource Path Location Type bad symbolic reference. A signature in StreamingContext.class refers to term conf in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling StreamingContext.class. Lab.scala /AirStream/src line 10 Scala Problem

Is there something I missed here? SparkContext works without errors, but StreamingContext gets this error all the time.

Upvotes: 1

Views: 7186

Answers (2)

Jeremy

Reputation: 587

I ran into roughly the same issue. Here is the Scala class I was writing for Scala/Spark practice:

package practice.spark

import org.apache.spark._
import org.apache.spark.sql._

// Minimal config class so the example compiles; the original code assumed one existed.
case class Configuration(appName: String)

object SparkService {
  def sparkInit(sparkInstanceConfig: Configuration): SparkService = {
    val sparkConf = new SparkConf().setAppName(sparkInstanceConfig.appName)
    new SparkService(sparkConf) // `return` is unnecessary in Scala
  }
}

class SparkService(sparkConf: SparkConf) {
  val sc = new SparkContext(sparkConf) // entry point for RDD operations
  val sql = new SQLContext(sc)         // SQL/DataFrame entry point (pre-2.0 API)
}

In my Eclipse project, under Properties > Java Build Path > Libraries, I had the JRE 8 library, the Scala 2.11 library, spark-core_2.11, and spark-sql_2.11. I was getting the error:

Description Resource Path Location Type missing or invalid dependency detected while loading class file 'SparkContext.class'. Could not access term hadoop in package org.apache, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'SparkContext.class' was compiled against an incompatible version of org.apache. BinAnalysisNew Unknown Scala Problem

I then added the hadoop-core jar to my Java build path and it cleared up this issue. I used the latest version of that jar.

This issue can also be avoided by using Gradle, sbt, or another build tool that resolves the transitive dependencies of each jar used in the project.
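For example, with sbt the transitive Hadoop dependencies are pulled in automatically. A minimal build.sbt sketch; the Scala version matches the spark-*_2.11 artifacts above, but the Spark version number is illustrative, not from the post:

```scala
// build.sbt (sketch) -- the Spark version here is an assumption; use your own
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.3", // pulls in hadoop-client transitively
  "org.apache.spark" %% "spark-sql"  % "1.6.3"
)
```

Because sbt resolves each artifact's own dependency list, the org.apache.hadoop classes land on the classpath without being added by hand.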

Upvotes: 5

lmm

Reputation: 17431

Make sure the version of Hadoop on the classpath matches the one the Spark Streaming jar was built against. Spark Streaming may also expect some dependencies to be provided by the cluster environment; if so, you will need to add them to the classpath manually when running in Eclipse.
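Both points can be sketched in a build.sbt fragment: the Spark artifact is marked "provided" (the cluster supplies it at runtime), and hadoop-client is pinned to the cluster's Hadoop version. All version numbers below are illustrative assumptions, not from this answer:

```scala
// build.sbt (sketch) -- versions are assumptions; match them to your cluster
libraryDependencies ++= Seq(
  // "provided": on the compile classpath, but supplied by the cluster at runtime.
  // When running inside Eclipse, you must put these on the run classpath yourself.
  "org.apache.spark" %% "spark-streaming" % "1.6.3" % "provided",
  // Pin the Hadoop client libraries to the exact version your cluster runs.
  "org.apache.hadoop" % "hadoop-client" % "2.6.0"
)
```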

Upvotes: 1
