blue-sky

Reputation: 53916

Monitoring a task in Apache Spark

I start the Spark master using ./sbin/start-master.sh, as described at http://spark.apache.org/docs/latest/spark-standalone.html

I then submit the Spark job:

sh ./bin/spark-submit \
  --class simplespark.Driver \
  --master spark://localhost:7077 \
  C:\\Users\\Adrian\\workspace\\simplespark\\target\\simplespark-0.0.1-SNAPSHOT.jar

How can I run a simple app that demonstrates a parallel task running?

When I view http://localhost:4040/executors/ and http://localhost:8080/ there are no tasks running:

(screenshots of the Spark UI at :4040 and the master UI at :8080, showing no tasks)

The .jar I'm running (simplespark-0.0.1-SNAPSHOT.jar) just contains a single Scala object:

package simplespark

import org.apache.spark.SparkContext

object Driver {

  def main(args: Array[String]) {

    val conf = new org.apache.spark.SparkConf()
      .setMaster("local")
      .setAppName("knn")
      .setSparkHome("C:\\spark-1.1.0-bin-hadoop2.4\\spark-1.1.0-bin-hadoop2.4")
      .set("spark.executor.memory", "2g")

    val sc = new SparkContext(conf)
    val l = List(1)

    sc.parallelize(l)

    while (true) {}

  }
}

Update: When I change --master spark://localhost:7077 to --master spark://Adrian-PC:7077,

I can see updates on the Spark UI:

(screenshot of the Spark UI showing the update)

I have also updated Driver.scala to use the default context, as I'm not sure whether I was setting it correctly for submitting Spark jobs:

package simplespark

import org.apache.spark.SparkContext

object Driver {

  def main(args: Array[String]) {

    System.setProperty("spark.executor.memory", "2g")

    val sc = new SparkContext();
    val l = List(1)

    val c = sc.parallelize(List(2, 3, 5, 7)).count()
    println(c)

    sc.stop

  }
}

On the Spark console I repeatedly receive the same message:

14/12/26 20:08:32 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

So it appears that the Spark job is not reaching the master?

Update 2: After I start the worker (thanks to Lomig Mégard's comment below) using:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://Adrian-PC:7077 

I receive the error:

14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/8 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor added: app-20141227212351-0003/9 on worker-20141227211411-Adrian-PC-58199 (Adrian-PC:58199) with 4 cores
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Granted executor ID app-20141227212351-0003/9 on hostPort Adrian-PC:58199 with 4 cores, 2.0 GB RAM
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now RUNNING
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now FAILED (java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified)
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/9 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED
14/12/27 21:23:52 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: Master removed our application: FAILED
14/12/27 21:23:52 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at Driver.scala:14)
14/12/27 21:23:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

I'm running the scripts on Windows using Cygwin. To fix this error I copied the Spark installation to the Cygwin C:\ drive, but then I receive a new error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

Upvotes: 1

Views: 3568

Answers (1)

Lomig Mégard

Reputation: 1838

You have to start the actual computation to see the job.

val c = sc.parallelize(List(2, 3, 5, 7)).count()
println(c)

Here, count is called an action; you need at least one action to begin a job. You can find the list of available actions in the Spark documentation.

The other methods are called transformations. They are lazily executed.

Don't forget to stop the context at the end, instead of your infinite loop, with sc.stop().
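Putting those points together, a minimal driver could look something like this (just an illustrative sketch; the list values are arbitrary, the conf mirrors the one in the question, and the master and executor memory are left to spark-submit):

package simplespark

import org.apache.spark.{SparkConf, SparkContext}

object Driver {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("knn")  // master and memory can be supplied by spark-submit
    val sc = new SparkContext(conf)

    val rdd = sc.parallelize(List(2, 3, 5, 7))    // just builds an RDD: no job yet
    val doubled = rdd.map(_ * 2)                  // map is a transformation: still lazy
    val c = doubled.count()                       // count is an action: this is the job you see in the UI
    println(c)

    sc.stop()                                     // release resources instead of looping forever
  }
}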

Edit: For the updated question, you allocate more memory to the executor than there is available in the worker. The defaults should be fine for simple tests.
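For instance, you could drop the explicit spark.executor.memory setting entirely, or request an amount that fits within what the worker advertises; a sketch (the 512m figure is only an illustration):

val conf = new org.apache.spark.SparkConf()
  .setAppName("knn")
  .set("spark.executor.memory", "512m")  // must fit within the worker's available memory
val sc = new org.apache.spark.SparkContext(conf)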

You also need to have a running worker linked to your master. See the Spark standalone documentation to start it.

./sbin/start-master.sh
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT

Upvotes: 1
