Reputation: 4023
I am trying to execute Scala code in Spark. The example code and build.sbt file can be found here.
There is one difference to that example: I already use version 2.0.0 of Spark (I downloaded it locally and defined the path in my .bashrc file). Accordingly, I also modified my build.sbt file and set the version to 2.0.0.
After that I get the following error messages.
Case 1:
I simply executed the SparkMeApp code as given in the link. I got an error message saying that I have to call the setMaster function:
16/09/05 19:37:01 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
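As far as I understand, the master has to be set on the SparkConf before the SparkContext is created; a rough sketch of what I mean (I am not sure this is the right place to put it):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: set the master explicitly on the configuration.
val conf = new SparkConf()
  .setAppName("SparkMe Application")
  .setMaster("local[2]") // or a cluster URL such as "spark://host:7077"
val sc = new SparkContext(conf)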
Case 2:
I called the setMaster function with different arguments and got the following error messages:
Input: setMaster("spark://<username>:7077") or setMaster("local[2]")
Error:
[error] (run-main-0) java.lang.ArrayIndexOutOfBoundsException: 0
java.lang.ArrayIndexOutOfBoundsException: 0
(as far as I understand, this error means that my argument string is empty)
In the other cases I just get the error:
16/09/05 19:44:29 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master <...>
org.apache.spark.SparkException: Exception thrown in awaitResult
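If it matters: I suspect the ArrayIndexOutOfBoundsException comes from args(0) being read although I start the program without any argument. A small guard like this (just a sketch) would at least make the failure readable:

def main(args: Array[String]): Unit = {
  // Sketch: fail with a clear message instead of ArrayIndexOutOfBoundsException
  // when no file path is passed, e.g. via: sbt "run /path/to/file"
  require(args.nonEmpty, "usage: run <path-to-input-file>")
  val fileName = args(0)
  // ... rest of the Spark job ...
}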
Additionally, I have only a little experience with Scala and with sbt, so my sbt is probably configured incorrectly. Can somebody please tell me the right way?
Upvotes: 1
Views: 498
Reputation: 4023
@Abhi, thank you very much for your answer. In general it works. However, I still get an error message after the code has executed correctly. I created a test txt file with 4 lines:
test file
test file
test file
test file
In SparkMeApp I changed the code line to:
val fileName = "/home/usr/test.txt"
After I execute run SparkMeApp.scala, I get the following output:
16/09/06 09:15:34 INFO DAGScheduler: Job 0 finished: count at SparkMeApp.scala:11, took 0.348171 s
There are 4 lines in /home/usr/test.txt
16/09/06 09:15:34 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:175)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229)
at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:172)
at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:67)
16/09/06 09:15:34 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:67)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64)
16/09/06 09:15:34 INFO SparkUI: Stopped Spark web UI at http://<myip>:4040
[success] Total time: 7 s, completed Sep 6, 2016 9:15:34 AM
I can see the correct output of my code (second line), but right after it I get the interrupt error. How can I fix it? In any case, I hope the code itself works correctly.
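My current guess is that sbt tears down the JVM while Spark's background threads (the ContextCleaner and the listener bus) are still running. Stopping the context explicitly before main returns is what I plan to try next; a rough sketch:

// Sketch only: stop the SparkContext before main returns so that sbt
// does not interrupt Spark's cleaner and listener threads on shutdown.
val sc = new SparkContext(conf)
try {
  val lines = sc.textFile(fileName).cache
  println(s"There are ${lines.count} lines in $fileName")
} finally {
  sc.stop()
}

I have also seen fork in run := true in build.sbt suggested, so that the job runs in its own JVM rather than inside sbt's.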
Upvotes: 0
Reputation: 366
This is what your minimal build.sbt should look like:
name := "SparkMe Project"
version := "1.0"
scalaVersion := "2.11.7"
organization := "pl.japila"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
And here is your SparkMeApp object:
import org.apache.spark.{SparkConf, SparkContext}

object SparkMeApp {
  def main(args: Array[String]) {
    // Run locally using all available cores; replace with a cluster URL if needed.
    val conf = new SparkConf()
      .setAppName("SparkMe Application")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // The input file path is passed as the first program argument.
    val fileName = args(0)
    val lines = sc.textFile(fileName).cache
    val c = lines.count
    println(s"There are $c lines in $fileName")
  }
}
Execute it like:
$ sbt "run [your file path]"
Upvotes: 1