Guforu

Reputation: 4023

sbt file doesn't recognize the spark input

I am trying to execute Scala code in Spark. The example code and build.sbt file can be found here.

There is one difference to this example: I am already using Spark 2.0.0 (I have downloaded it locally and defined the path in my .bashrc file). I have also modified my build.sbt file accordingly and set the version to 2.0.0. After that I get the error messages described below.

Case 1: I executed the code of SparkMeApp exactly as given in the link. I got an error message saying that I have to set the master URL (via setMaster).

16/09/05 19:37:01 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration

Case 2: I call setMaster with different arguments and get the following error messages:

Input: setMaster("spark://<username>:7077") or setMaster("local[2]"). Error:

[error] (run-main-0) java.lang.ArrayIndexOutOfBoundsException: 0
java.lang.ArrayIndexOutOfBoundsException: 0

(this error means that the args array is empty, i.e. no file path argument was passed)
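For illustration, the exception comes from reading args(0) when run is started without a program argument; a minimal guard, purely a sketch and not part of the linked example, would be:

// args is empty when "sbt run" is called without a program argument,
// so args(0) throws ArrayIndexOutOfBoundsException: 0
if (args.isEmpty) {
  sys.error("Usage: SparkMeApp <file path>")
}
val fileName = args(0)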

In the other cases I just get this error:

16/09/05 19:44:29 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master <...>
org.apache.spark.SparkException: Exception thrown in awaitResult
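A spark://<host>:7077 URL only works if a standalone master is actually running on that host. For a purely local test no standalone master is needed at all; a minimal configuration along these lines should be enough (local[2] just means two worker threads):

// runs Spark inside the local JVM, no standalone master required
val conf = new SparkConf()
  .setAppName("SparkMe Application")
  .setMaster("local[2]")  // or "local[*]" to use all available cores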

Additionally, I have only a little experience with Scala and sbt, so my sbt is probably configured incorrectly. Can somebody please tell me the right way?

Upvotes: 1

Views: 498

Answers (2)

Guforu

Reputation: 4023

@Abhi, thank you very much for your answer. In general it works. However, I still get an error message after the code runs correctly. I created a test txt file with 4 lines:

test file
test file
test file
test file

In SparkMeApp I changed the code line to:

val fileName = "/home/usr/test.txt"

After I execute run SparkMeApp.scala I get the following output:

16/09/06 09:15:34 INFO DAGScheduler: Job 0 finished: count at SparkMeApp.scala:11, took 0.348171 s
There are 4 lines in /home/usr/test.txt
16/09/06 09:15:34 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:175)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229)
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:172)
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:67)
16/09/06 09:15:34 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:67)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64)
16/09/06 09:15:34 INFO SparkUI: Stopped Spark web UI at http://<myip>:4040
[success] Total time: 7 s, completed Sep 6, 2016 9:15:34 AM

I can see the correct output of my code (the second line), but after that I get the interrupt error. How can I fix it? Anyway, I hope the code is working correctly.
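One thing that seems to help in similar setups (this is an assumption, not something from the linked example) is stopping the SparkContext explicitly at the end of main and letting sbt fork the run into its own JVM, so that sbt does not interrupt Spark's cleaner and listener threads on exit:

// at the end of main in SparkMeApp: shut Spark down cleanly
val c = lines.count
println(s"There are $c lines in $fileName")
sc.stop()

// in build.sbt: run the application in a separate JVM
fork in run := true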

Upvotes: 0

Abhi

Reputation: 366

This is how your minimal build.sbt should look:

name := "SparkMe Project"

version := "1.0"

scalaVersion := "2.11.7"

organization := "pl.japila"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"

And here is your SparkMeApp object:

import org.apache.spark.{SparkConf, SparkContext}

object SparkMeApp {
  def main(args: Array[String]) {
    // run locally, using all available cores as the master
    val conf = new SparkConf()
      .setAppName("SparkMe Application")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // the file to count is passed as the first command-line argument
    val fileName = args(0)
    val lines = sc.textFile(fileName).cache

    val c = lines.count
    println(s"There are $c lines in $fileName")
  }
}

Execute it like this:

$ sbt "run [your file path]"

Upvotes: 1
