user6308605

Reputation: 731

spark-submit to minikube error related to krb5.conf

I'm following this gist to spark-submit to minikube: https://gist.github.com/jjstill/8099669931cdfbb90ce6f4c307971514

This is my modified version called spark-minikube.sh:

minikube --memory 8192 --cpus 3 start

kubectl create namespace spark

kubectl create serviceaccount spark-serviceaccount --namespace spark
kubectl create clusterrolebinding spark-rolebinding --clusterrole=edit --serviceaccount=spark:spark-serviceaccount --namespace=spark

cd $SPARK_HOME

# Asking local environment to use Docker daemon inside the Minikube
eval $(minikube docker-env)

# docker build -t spark:latest -f /path/to/Dockerfile .
IMG_NAME=asia.gcr.io/project-id/my-image:latest

# Submitting SparkPi example job
# $KUBERNETES_MASTER can be taken from output of kubectl cluster-info
KUBERNETES_MASTER=https://127.0.0.1:<port_number>

spark-submit --master k8s://$KUBERNETES_MASTER \
                 --deploy-mode cluster \
                 --name spark-pi \
                 --jars jars/gcs-connector-hadoop2-2.0.1-shaded.jar,jars/spark-bigquery-latest_2.12.jar \
                 --conf spark.executor.instances=2 \
                 --conf spark.kubernetes.namespace=spark \
                 --conf spark.kubernetes.container.image=${IMG_NAME} \
                 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-serviceaccount \
                 local:///app/main.py
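
As an aside, rather than hard-coding the port, the master URL can also be read from the active kubeconfig. A minimal sketch, assuming minikube is the current kubectl context:

# Derive the API server URL from the current kubeconfig context
# (--minify limits the output to the active context)
KUBERNETES_MASTER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')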

I'm getting this error:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/Cellar/apache-spark/3.1.1/libexec/jars/spark-unsafe_2.12-3.1.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
21/05/23 17:33:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/05/23 17:33:45 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
21/05/23 17:33:45 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
Exception in thread "main" org.apache.spark.SparkException: Please specify spark.kubernetes.file.upload.path property.
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:299)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:248)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:247)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:173)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:164)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$3(KubernetesDriverBuilder.scala:60)
        at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
        at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
        at scala.collection.immutable.List.foldLeft(List.scala:89)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:58)
        at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3(KubernetesClientApplication.scala:213)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3$adapted(KubernetesClientApplication.scala:207)
        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2611)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/05/23 17:33:45 INFO ShutdownHookManager: Shutdown hook called
21/05/23 17:33:45 INFO ShutdownHookManager: Deleting directory /private/var/folders/t2/psknqk615q7chtsr41qymznm0000gp/T/spark-100c4448-32bb-4fac-b5b5-d7a1b20d8525

Maybe it has something to do with this error message:

INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.

How can I fix this? I'm not able to locate the krb5.conf anyway...

Upvotes: 1

Views: 1763

Answers (1)

eugen-fried

Reputation: 2173

Despite the cryptic message, what it actually means is that you haven't told Spark where to fetch your dependency jars from. (The krb5.conf INFO line is unrelated; it is purely informational unless you actually use Kerberos.) From the official documentation:

If your application’s dependencies are all hosted in remote locations like HDFS or HTTP servers, they may be referred to by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images. Those dependencies can be added to the classpath by referencing them with local:// URIs and/or setting the SPARK_EXTRA_CLASSPATH environment variable in your Dockerfiles. The local:// scheme is also required when referring to dependencies in custom-built Docker images in spark-submit. We support dependencies from the submission client’s local file system using the file:// scheme or without a scheme (using a full path), where the destination should be a Hadoop compatible filesystem.
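
That last sentence matters here: a file:// or schemeless path only works when Spark also knows where to stage the files, which is exactly what the exception is asking for. A sketch of that alternative, assuming a GCS bucket you control (the bucket name is a placeholder):

--conf spark.kubernetes.file.upload.path=gs://your-bucket/spark-uploads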

The simpler fix, though, is to prefix each entry in your --jars argument with local://, so Spark resolves the jars inside the container image instead of trying to upload them:

--jars local:///full/path/to/jars/gcs-connector-hadoop2-2.0.1-shaded.jar,local:///full/path/to/jars/spark-bigquery-latest_2.12.jar
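
Putting it together, the full invocation might look like this (a sketch; the /app/jars paths are an assumption and must match wherever the jars actually live inside your image):

spark-submit --master k8s://$KUBERNETES_MASTER \
                 --deploy-mode cluster \
                 --name spark-pi \
                 --jars local:///app/jars/gcs-connector-hadoop2-2.0.1-shaded.jar,local:///app/jars/spark-bigquery-latest_2.12.jar \
                 --conf spark.executor.instances=2 \
                 --conf spark.kubernetes.namespace=spark \
                 --conf spark.kubernetes.container.image=${IMG_NAME} \
                 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-serviceaccount \
                 local:///app/main.py

Keep in mind that local:// paths are resolved inside the container, so the jars have to be baked into the image (or mounted) at exactly those paths.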

Upvotes: 3
