Reputation: 343
There is a lot I clearly don't understand about Spark, Spark Jobserver, and Mesosphere's DC/OS. But I very much like the Jobserver project, and also very much like our DC/OS cluster, and would really like to get them running together.
Throwing the Docker container into a Marathon file, as in this example, does not work. I thought this might all be because I don't know what SPARK_MASTER URL to pass in (which I still don't know; any help there would be greatly appreciated), but then I tried removing it from the Marathon file, which should still run the project in local mode, and that also fails. Which makes me realize: beyond not knowing how to connect this jobserver to my DC/OS Spark dispatcher, I also don't understand why this Docker container fails on the cluster but not on my local machine, even when it is passed no arguments.
My logs do not show much, and the Docker container exits with a status of 137 after the following in stdout:
LOG_DIR empty; logging will go to /tmp/job-server
When I run things locally, that is the last line before log4j output starts appearing on stdout and tells me the jobserver is starting up. I see the following in stderr:
app/server_start.sh: line 54: 15 Killed $SPARK_HOME/bin/spark-submit --class $MAIN --driver-memory $JOBSERVER_MEMORY --conf "spark.executor.extraJavaOptions=$LOGGING_OPTS" --driver-java-options "$GC_OPTS $JAVA_OPTS $LOGGING_OPTS $CONFIG_OVERRIDES" $@ $appdir/spark-job-server.jar $conffile
This seems to suggest that server_start.sh is running inside the Spark Jobserver Docker container, and that the script is dying for some reason.
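For what it's worth, exit status 137 has a specific meaning under POSIX shell conventions: it is 128 + 9, i.e. the process was terminated with SIGKILL. On a cluster this typically points at the kernel OOM killer or the container runtime enforcing a memory limit. A minimal sketch demonstrating the convention:

```shell
# Exit status 137 = 128 + 9: the process received SIGKILL (signal 9).
# On a cluster this usually means the OOM killer, or the container runtime
# enforcing a memory limit, killed the JVM that spark-submit launched.
sh -c 'kill -KILL $$'   # simulate a process killed with SIGKILL
echo "exit status: $?"  # prints "exit status: 137"
```

If that is what is happening here, the `"mem": 100` in the Marathon file would be far too small for a JVM running spark-submit.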
I stripped my marathon file all the way down to this, which is still giving me the same errors:
{
  "id": "/jobserver",
  "cpus": 0.5,
  "mem": 100,
  "ports": [0],
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "velvia/spark-jobserver:0.6.2.mesos-0.28.1.spark-1.6.1"
    }
  }
}
Any help would be greatly appreciated.
Upvotes: 1
Views: 709
Reputation: 1353
The following worked for me when I tried it.
{
  "id": "/spark.jobserver",
  "cmd": null,
  "cpus": 2,
  "mem": 2048,
  "disk": 50,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "velvia/spark-jobserver:0.6.2.mesos-0.28.1.spark-1.6.1",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 8090,
          "hostPort": 0,
          "servicePort": 10001,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false,
      "parameters": [],
      "forcePullImage": false
    }
  },
  "env": {
    "SPARK_MASTER": "mesos://zk://10.29.83.3:2181,10.29.83.4:2181/mesos"
  },
  "portDefinitions": [
    {
      "port": 10001,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}
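A sketch of how a definition like this can be validated and deployed from the command line. The filename `jobserver.json` is an assumption, and `master.mesos:8080` is the default Marathon address on DC/OS; substitute your cluster's values:

```shell
# Save the app definition above as jobserver.json (hypothetical filename).
cat > jobserver.json <<'EOF'
{
  "id": "/spark.jobserver",
  "cmd": null,
  "cpus": 2,
  "mem": 2048,
  "disk": 50,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "velvia/spark-jobserver:0.6.2.mesos-0.28.1.spark-1.6.1",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 8090,
          "hostPort": 0,
          "servicePort": 10001,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false,
      "parameters": [],
      "forcePullImage": false
    }
  },
  "env": {
    "SPARK_MASTER": "mesos://zk://10.29.83.3:2181,10.29.83.4:2181/mesos"
  },
  "portDefinitions": [
    {
      "port": 10001,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}
EOF

# Validate the JSON before posting it to Marathon.
python3 -m json.tool jobserver.json > /dev/null && echo "valid JSON"

# Deploy via the Marathon REST API (adjust the address for your cluster):
# curl -X POST http://master.mesos:8080/v2/apps \
#      -H 'Content-Type: application/json' -d @jobserver.json
```

Once the task is running, Marathon maps the container's port 8090 to a host port, so the jobserver's REST API should be reachable on the assigned host:port (or via service port 10001 through the cluster's load balancer).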
Upvotes: 3