Alex

Reputation: 21

ClassNotFoundException when running Spark 'task' programmatically in yarn-client master

I'm trying to integrate Spark with a web UI running in Play. The idea is to access the processing result files (Parquet format) in HDFS, use Spark's SparkContext and SQLContext to perform different tasks, and then serve the transformed data back to the UI.

The cluster is running in distributed mode, is fully functional, and has the following setup: Hadoop 2.6 and Spark 1.5.1. Jobs submitted like:

spark-submit --master yarn --deploy-mode client --num-executors 25 --executor-cores 1 --executor-memory 1g --driver-memory 1g --class ...

produce the expected results and don't cause any issues.

As part of the Spark/Play integration, a simple test case was defined to check that the SparkContext is created correctly.

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.FlatSpec

trait SparkOutBehaviour {
  this: FlatSpec =>

  def checkContext(): Unit = {
    it should "create spark context and do simple map" in {
      val conf = new SparkConf()
        .setAppName("DWUI")
        .setMaster("yarn-client")
        .setSparkHome("/workspace/spark")
        .set("spark.ui.enabled", "false")
        .set("spark.yarn.jar", "hdfs:///spark/lib/spark-assembly-1.5.1-hadoop2.6.0.jar")
        .set("spark.logConf", "true")

      val sc = new SparkContext(conf)
      val data = sc.parallelize(1 to 100)
      data.map(math.sqrt(_)).collect().foreach(println)
    }
  }
}

To be able to execute this code, the following has been added to build.sbt:

unmanagedClasspath in Test += file("/workspace/hadoop/etc/hadoop")

Execution of this test fails with the stack trace below.

Any help will be more than welcome, thank you in advance.

[info] Loading project definition from ...
[info] Set current project to dynaprix-batch (in build file:.../)
[info] Compiling 1 Scala source to .../target/scala-2.10/test-classes...
16/03/01 09:27:25 INFO spark.SparkContext: Running Spark version 1.5.1
16/03/01 09:27:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/01 09:27:36 INFO spark.SparkContext: Spark configuration:
spark.app.name=DWUI
spark.home=/workspace/spark
spark.logConf=true
spark.master=yarn-client
spark.ui.enabled=false
spark.yarn.jar=hdfs:///spark/lib/spark-assembly-1.5.1-hadoop2.6.0.jar
16/03/01 09:27:36 INFO spark.SecurityManager: Changing view acls to: ...
16/03/01 09:27:36 INFO spark.SecurityManager: Changing modify acls to: ...
16/03/01 09:27:36 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(...); users with modify permissions: Set(...)
16/03/01 09:27:37 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/01 09:27:37 INFO Remoting: Starting remoting
16/03/01 09:27:37 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:59737]
16/03/01 09:27:37 INFO util.Utils: Successfully started service 'sparkDriver' on port 59737.
16/03/01 09:27:37 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/01 09:27:37 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/01 09:27:37 INFO storage.DiskBlockManager: Created local directory at /private/var/folders/5d/2fjhh4m14c71q9hgtjf7nsnc0000gn/T/blockmgr-1b1bc4d2-c628-4e8e-909e-37e9c7fe6bb7
16/03/01 09:27:37 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
16/03/01 09:27:37 INFO spark.HttpFileServer: HTTP File server directory is /private/var/folders/5d/2fjhh4m14c71q9hgtjf7nsnc0000gn/T/spark-cec72be8-ed71-463d-9b75-9cf7163b967a/httpd-2610e135-d54c-445f-b0a9-65a2d9841946
16/03/01 09:27:37 INFO spark.HttpServer: Starting HTTP Server
16/03/01 09:27:37 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/01 09:27:37 INFO server.AbstractConnector: Started [email protected]:59738
16/03/01 09:27:37 INFO util.Utils: Successfully started service 'HTTP file server' on port 59738.
16/03/01 09:27:37 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/01 09:27:37 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/01 09:27:37 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
16/03/01 09:27:38 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/03/01 09:27:38 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/03/01 09:27:38 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/01 09:27:38 INFO yarn.Client: Setting up container launch context for our AM
16/03/01 09:27:38 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/01 09:27:38 INFO yarn.Client: Preparing resources for our AM container
16/03/01 09:27:38 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/spark/lib/spark-assembly-1.5.1-hadoop2.6.0.jar
16/03/01 09:27:38 INFO yarn.Client: Uploading resource file:/private/var/folders/5d/2fjhh4m14c71q9hgtjf7nsnc0000gn/T/spark-cec72be8-ed71-463d-9b75-9cf7163b967a/__spark_conf__4520540564284458235.zip -> hdfs://localhost:9000/user/.../.sparkStaging/application_1456736333878_0035/__spark_conf__4520540564284458235.zip
16/03/01 09:27:38 INFO spark.SecurityManager: Changing view acls to: ...
16/03/01 09:27:38 INFO spark.SecurityManager: Changing modify acls to: ...
16/03/01 09:27:38 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(...); users with modify permissions: Set(...)
16/03/01 09:27:38 INFO yarn.Client: Submitting application 35 to ResourceManager
16/03/01 09:27:38 INFO impl.YarnClientImpl: Submitted application application_1456736333878_0035
16/03/01 09:27:39 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:39 INFO yarn.Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1456820858863
     final status: UNDEFINED
     tracking URL: http://10.0.1.12:8088/proxy/application_1456736333878_0035/
     user: ...
16/03/01 09:27:40 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:41 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:42 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:43 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:44 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:45 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:46 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:47 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:48 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:49 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:50 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:51 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:52 INFO yarn.Client: Application report for application_1456736333878_0035 (state: ACCEPTED)
16/03/01 09:27:53 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:59748/user/YarnAM#1326936326])
16/03/01 09:27:53 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> 10.0.1.12, PROXY_URI_BASES -> http://10.0.1.12:8088/proxy/application_1456736333878_0035), /proxy/application_1456736333878_0035
16/03/01 09:27:53 INFO yarn.Client: Application report for application_1456736333878_0035 (state: RUNNING)
16/03/01 09:27:53 INFO yarn.Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.0.1.12
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1456820858863
     final status: UNDEFINED
     tracking URL: http://10.0.1.12:8088/proxy/application_1456736333878_0035/
     user: ...
16/03/01 09:27:53 INFO cluster.YarnClientSchedulerBackend: Application application_1456736333878_0035 has started running.
16/03/01 09:27:54 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 59752.
16/03/01 09:27:54 INFO netty.NettyBlockTransferService: Server created on 59752
16/03/01 09:27:54 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/01 09:27:54 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.0.1.12:59752 with 530.3 MB RAM, BlockManagerId(driver, 10.0.1.12, 59752)
16/03/01 09:27:54 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/01 09:28:07 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/01 09:28:08 INFO spark.SparkContext: Starting job: collect at SparkOutSpec.scala:37
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Got job 0 (collect at SparkOutSpec.scala:37) with 2 output partitions
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Final stage: ResultStage 0(collect at SparkOutSpec.scala:37)
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Missing parents: List()
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkOutSpec.scala:37), which has no missing parents
16/03/01 09:28:08 INFO storage.MemoryStore: ensureFreeSpace(2056) called with curMem=0, maxMem=556038881
16/03/01 09:28:08 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.0 KB, free 530.3 MB)
16/03/01 09:28:08 INFO storage.MemoryStore: ensureFreeSpace(1301) called with curMem=2056, maxMem=556038881
16/03/01 09:28:08 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1301.0 B, free 530.3 MB)
16/03/01 09:28:08 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.1.12:59752 (size: 1301.0 B, free: 530.3 MB)
16/03/01 09:28:08 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
16/03/01 09:28:08 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkOutSpec.scala:37)
16/03/01 09:28:08 INFO cluster.YarnScheduler: Adding task set 0.0 with 2 tasks
16/03/01 09:28:08 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@localhost:59762/user/Executor#-1228759911]) with ID 1
16/03/01 09:28:08 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/01 09:28:08 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@localhost:59760/user/Executor#2117534661]) with ID 2
16/03/01 09:28:08 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 2142 bytes)
16/03/01 09:28:08 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:59764 with 530.3 MB RAM, BlockManagerId(1, localhost, 59764)
16/03/01 09:28:08 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:59765 with 530.3 MB RAM, BlockManagerId(2, localhost, 59765)
16/03/01 09:28:08 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:59764 (size: 1301.0 B, free: 530.3 MB)
16/03/01 09:28:08 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:59765 (size: 1301.0 B, free: 530.3 MB)
16/03/01 09:28:09 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassNotFoundException: this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:274)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 1]
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, localhost, PROCESS_LOCAL, 2142 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 0.0 (TID 2) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 2]
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 0.0 (TID 4, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 3]
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 1.2 in stage 0.0 (TID 5, localhost, PROCESS_LOCAL, 2142 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 0.0 (TID 4) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 4]
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 0.0 (TID 6, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 5]
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, localhost, PROCESS_LOCAL, 2142 bytes)
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 6]
16/03/01 09:28:09 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
16/03/01 09:28:09 INFO cluster.YarnScheduler: Cancelling stage 0
16/03/01 09:28:09 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/03/01 09:28:09 INFO cluster.YarnScheduler: Stage 0 was cancelled
16/03/01 09:28:09 INFO scheduler.TaskSetManager: Lost task 1.3 in stage 0.0 (TID 7) on executor localhost: java.lang.ClassNotFoundException (this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1) [duplicate 7]
16/03/01 09:28:09 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/03/01 09:28:09 INFO scheduler.DAGScheduler: ResultStage 0 (collect at SparkOutSpec.scala:37) failed in 0.601 s
16/03/01 09:28:09 INFO scheduler.DAGScheduler: Job 0 failed: collect at SparkOutSpec.scala:37, took 0.780054 s
[info] SparkOutSpec:
[info] bootstrap test
[info] - should create spark context and do simple map *** FAILED ***
[info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, localhost): java.lang.ClassNotFoundException: this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1
[info]  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
[info]  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
[info]  at java.security.AccessController.doPrivileged(Native Method)
[info]  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
[info]  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
[info]  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
[info]  at java.lang.Class.forName0(Native Method)
[info]  at java.lang.Class.forName(Class.java:274)
[info]  at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
[info]  at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
[info]  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
[info]  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
[info]  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
[info]  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
[info]  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
[info]  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
[info]  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
[info]  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
[info]  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
[info]  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
[info]  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
[info]  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
[info]  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
[info]  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
[info]  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
[info]  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
[info]  at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
[info]  at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
[info]  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
[info]  at org.apache.spark.scheduler.Task.run(Task.scala:88)
[info]  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
[info]  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[info]  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[info]  at java.lang.Thread.run(Thread.java:745)
[info] 
[info] Driver stacktrace:
[info]   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
[info]   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
[info]   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
[info]   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
[info]   at scala.Option.foreach(Option.scala:236)
[info]   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
[info]   ...
[info]   Cause: java.lang.ClassNotFoundException: this.is.myclass.SparkOutBehaviour$$anonfun$checkContext$1$$anonfun$apply$mcV$sp$1
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
[info]   at java.security.AccessController.doPrivileged(Native Method)
[info]   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
[info]   at java.lang.Class.forName0(Native Method)
[info]   at java.lang.Class.forName(Class.java:274)
[info]   at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
[info]   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
[info]   ...
[info] ScalaTest
[info] Run completed in 44 seconds, 343 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed: Total 1, Failed 1, Errors 0, Passed 0
[error] Failed tests:
[error]     this.is.myclass.SparkOutSpec
[error] (test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 50 s, completed Mar 1, 2016 9:28:09 AM
16/03/01 09:28:09 INFO spark.SparkContext: Invoking stop() from shutdown hook
16/03/01 09:28:09 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/03/01 09:28:09 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/03/01 09:28:09 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
16/03/01 09:28:09 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/03/01 09:28:09 INFO cluster.YarnClientSchedulerBackend: Stopped
16/03/01 09:28:09 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/01 09:28:09 INFO storage.MemoryStore: MemoryStore cleared
16/03/01 09:28:09 INFO storage.BlockManager: BlockManager stopped
16/03/01 09:28:09 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/01 09:28:09 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/01 09:28:09 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/01 09:28:09 INFO util.ShutdownHookManager: Shutdown hook called
16/03/01 09:28:09 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/5d/2fjhh4m14c71q9hgtjf7nsnc0000gn/T/spark-cec72be8-ed71-463d-9b75-9cf7163b967a

Upvotes: 2

Views: 849

Answers (1)

innovatism

Reputation: 389

You should make your class serializable. The ClassNotFoundException occurs on the worker because the worker never receives your class.
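One hedged way to act on the "worker never receives your class" part is to ship the jar containing the test classes (including the compiled anonymous-function classes) to the executors via SparkConf. This is only a sketch: the jar path below is an assumption, and would need to match whatever `sbt test:package` actually produces in your project.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: setJars distributes the listed jars to the YARN executors,
// so the closure classes generated from the test can be loaded there.
// The jar path is hypothetical -- substitute the artifact your build produces.
val conf = new SparkConf()
  .setAppName("DWUI")
  .setMaster("yarn-client")
  .setJars(Seq("target/scala-2.10/dynaprix-batch_2.10-test.jar"))

val sc = new SparkContext(conf)
```

`SparkContext.addJar(...)` after context creation is an equivalent alternative for the same purpose.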

Upvotes: 0
