Reputation: 853
I have created a Spark Streaming application, which worked fine when the deploy mode was client.
On my virtual machine I have a master and only one worker.
When I tried to change the mode to "cluster", it fails. In the web UI, I see that the driver is running, but the application has failed.
EDITED
In the log, I see the following content:
16/03/23 09:06:25 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
16/03/23 09:06:25 INFO Master: Launching driver driver-20160323090625-0001 on worker worker-20160323085541-10.0.2.15-36648
16/03/23 09:06:32 INFO Master: metering.dev.enerbyte.com:37168 got disassociated, removing it.
16/03/23 09:06:32 INFO Master: 10.0.2.15:59942 got disassociated, removing it.
16/03/23 09:06:32 INFO Master: metering.dev.enerbyte.com:37166 got disassociated, removing it.
16/03/23 09:06:46 INFO Master: Registering app wibeee-pipeline
16/03/23 09:06:46 INFO Master: Registered app wibeee-pipeline with ID app-20160323090646-0007
16/03/23 09:06:46 INFO Master: Launching executor app-20160323090646-0007/0 on worker worker-20160323085541-10.0.2.15-36648
16/03/23 09:06:50 INFO Master: Received unregister request from application app-20160323090646-0007
16/03/23 09:06:50 INFO Master: Removing app app-20160323090646-0007
16/03/23 09:06:50 WARN Master: Got status update for unknown executor app-20160323090646-0007/0
16/03/23 09:06:50 INFO Master: metering.dev.enerbyte.com:37172 got disassociated, removing it.
16/03/23 09:06:50 INFO Master: 10.0.2.15:45079 got disassociated, removing it.
16/03/23 09:06:51 INFO Master: Removing driver: driver-20160323090625-0001
So what happens is that the master launches the driver on the worker, the application gets registered, and then an executor is launched on that same worker, which fails (although I have only one worker!).
EDIT Could the issue be related to the fact that I use checkpointing? I have an "updateStateByKey" transformation in my code. The checkpoint directory is set to "/tmp", but I always get a warning that "/tmp" needs to change when running in cluster mode. How should I set it?
Can that be the reason for my problem?
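The relevant part of my streaming code looks roughly like this (simplified sketch; the input source and the state update logic are placeholders, only the app name, the "/tmp" checkpoint directory and updateStateByKey are as in my real code):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("wibeee-pipeline")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Checkpointing is required by updateStateByKey; currently it points to a local path.
    ssc.checkpoint("/tmp")

    val lines = ssc.socketTextStream("localhost", 9999) // placeholder input source
    val pairs = lines.map(word => (word, 1))

    // Keep a running count per key; this is the stateful step that forces checkpointing.
    val updateCounts = (values: Seq[Int], state: Option[Int]) =>
      Some(values.sum + state.getOrElse(0))
    val totals = pairs.updateStateByKey[Int](updateCounts)

    totals.print()
    ssc.start()
    ssc.awaitTermination()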
Thank you
Upvotes: 1
Views: 746
Reputation: 687
According to the log you have provided, it may not be because of a properties file, but check this:
spark-submit only copies the jar file to the driver when running in cluster mode, so if your application tries to read a properties file kept on the machine from which you run spark-submit, the driver cannot find it when running in cluster mode.
Reading from a properties file works in client mode because the driver starts on the same machine where you execute spark-submit.
You can copy the properties file to the same directory on all nodes, or keep the properties file in the Cassandra file system and read it from there.
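For example, if the file is copied to the same absolute path on every node, the driver can load it no matter which worker it is launched on. A minimal sketch (the path and property key below are placeholders, not from your application):

    import java.io.FileInputStream
    import java.util.Properties

    // Load settings from a path that exists on every node, so the driver
    // finds the file even when it is launched on a worker in cluster mode.
    def loadProps(path: String): Properties = {
      val props = new Properties()
      val in = new FileInputStream(path)
      try props.load(in) finally in.close()
      props
    }

    // "/opt/myapp/conf/app.properties" and "cassandra.host" are placeholder names.
    val props = loadProps("/opt/myapp/conf/app.properties")
    val cassandraHost = props.getProperty("cassandra.host")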
Upvotes: 0