Reputation: 31
I am trying to submit a Spark application to a minikube k8s cluster (Spark version: 2.4.3) using the command below:
spark-submit \
--master <K8S_MASTER> \
--deploy-mode cluster \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=<my docker image> \
--conf spark.kubernetes.driver.pod.name=spark-py-driver \
--conf spark.executor.memory=2g \
--conf spark.driver.memory=2g \
local:///home/proj/app/run.py <arguments>
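Here local:// tells Spark that run.py is resolved inside the container image rather than uploaded from the submitting machine. After submitting, I watch the driver pod come up with (the namespace placeholder is mine):
kubectl get pods -n <namespace> -w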
Note that the Python script run.py exists in my Docker image at that same path. Once I do the spark-submit, the Spark job starts and the driver pod gets killed. I could see only the log line below in the driver pod:
[FATAL tini (6)] exec driver-py failed: No such file or directory
I have verified the PySpark job by doing a docker run on the Docker image and was able to see that the above Python code gets executed.
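Roughly like this (the python3 entrypoint override is my approximation of that manual check; adjust to whatever interpreter the image ships):
docker run --rm --entrypoint python3 <my docker image> /home/proj/app/run.py <arguments>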
These are the events for the failed driver pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 52m default-scheduler Successfully assigned ***-develop/run-py-1590847453453-driver to minikube
Warning FailedMount 52m kubelet, minikube MountVolume.SetUp failed for volume "spark-conf-volume" : configmap "run-py-1590847453453-driver-conf-map" not found
Normal Pulled 52m kubelet, minikube Container image "******************:latest" already present on machine
Normal Created 52m kubelet, minikube Created container spark-kubernetes-driver
Normal Started 52m kubelet, minikube Started container spark-kubernetes-driver
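The events and the fatal log line above were pulled with kubectl (the namespace placeholder is mine):
kubectl describe pod run-py-1590847453453-driver -n <namespace>
kubectl logs run-py-1590847453453-driver -n <namespace>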
Upvotes: 2
Views: 2005
Reputation: 1
The spark-submit you are using is not version 3.0.0, while your base image appears to be built from a Spark 3.0.0 distribution. A 2.4.x client passes driver-py as the container's first argument for PySpark jobs in cluster mode; the Spark 3.x entrypoint no longer recognizes that argument and falls through to exec-ing it as a command, which is exactly the tini error you see. You also need to change the Spark installation that provides spark-submit to version 3.0.0, so that the client matches the image.
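A quick way to confirm the mismatch from the command line (the /opt/spark path inside the image is an assumption; adjust to your image's layout):
# version of the client doing the submission
spark-submit --version
# version of the Spark distribution baked into the image
docker run --rm --entrypoint /opt/spark/bin/spark-submit <my docker image> --version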
Upvotes: 0
Reputation: 31
I am using one of the base images from my org. But the issue regarding the mount is only a warning, and the pod was successfully assigned after that.
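As I understand it, spark-submit creates that configmap right after the driver pod (the pod owns it), so the first mount attempt can race and warn before succeeding; it can be checked with (namespace placeholder is mine):
kubectl get configmap run-py-1590847453453-driver-conf-map -n <namespace>
Here is the Dockerfile: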
FROM <project_repo>/<proj>/${SPARK_ALPINE_BUILD}
ENV SPARK_OPTS --driver-java-options=-Dlog4j.logLevel=info
ENV SPARK_MASTER "spark://spark-master:7077"
ADD https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.jar $SPARK_HOME/jars/
ADD https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.11/2.3.2/spark-cassandra-connector_2.11-2.3.2.jar $SPARK_HOME/jars/
USER root
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
WORKDIR /home/<proj>/app
# copy files
COPY src/configs ./configs
COPY src/dependencies ./dependencies
COPY src/jobs ./jobs
COPY src/run.py ./run.py
COPY run.sh ./run.sh
COPY src/requirements.txt .
# install packages here
RUN set -e; \
pip install --no-cache-dir -r requirements.txt;
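To double-check that run.py ends up where the submit command expects it, the image can be built and inspected like this (the tag is a placeholder):
docker build -t <my docker image> .
docker run --rm --entrypoint ls <my docker image> -l /home/<proj>/app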
Upvotes: 1