Lorenz
Lorenz

Reputation: 133

spark-submit on local microK8's Kubernetes cluster fails with: java.security.cert.CertPathValidatorException

I am trying to to submit the spark-pi example on my local microK8's Kubernetes cluster with the following command:

$SPARK_HOME/bin/spark-submit \
 --master k8s://127.0.0.1:16443 \
 --deploy-mode cluster \
 --name spark-pi \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.executor.instances=2 \
 --conf spark.kubernetes.container.image=localhost:32000/spark-base:registry \
 --conf spark.app.name=spark-pi \
 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
 --conf spark.kubernetes.authenticate.caCertFile=/var/snap/microk8s/current/certs/ca.crt \
 local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar

Unfortunately I get this error which I am not able to resolve:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2021-03-31 17:56:05,184 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-31 17:56:05,362 INFO k8s.SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
2021-03-31 17:56:06,209 INFO features.KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]  for kind: [Pod]  with name: [null]  in namespace: [default]  failed.
        at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
        at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:349)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:84)
        at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:139)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3(KubernetesClientApplication.scala:213)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3$adapted(KubernetesClientApplication.scala:207)
        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2611)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
        at sun.security.ssl.Alert.createSSLException(Alert.java:131)
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
        at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
        at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
        at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
        at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
        at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
        at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:422)
        at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
        at sun.security.ssl.SSLTransport.decode(SSLTransport.java:149)
        at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1143)
        at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1054)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:394)
        at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:320)
        at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:284)
        at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:169)
        at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:258)
        at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
        at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
        at okhttp3.RealCall.execute(RealCall.java:93)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:252)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:879)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:341)
        ... 14 more
Caused by: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:386)
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:291)
        at sun.security.validator.Validator.validate(Validator.java:271)
        at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:315)
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:223)
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:129)
        at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
        ... 60 more
Caused by: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:154)
        at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:80)
        at java.security.cert.CertPathValidator.validate(CertPathValidator.java:292)
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:381)
        ... 66 more
2021-03-31 17:56:06,643 INFO util.ShutdownHookManager: Shutdown hook called
2021-03-31 17:56:06,644 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-77846d32-704c-4d50-b732-6ce5c83f494a

First I suspected the SSL encryption for the microK8 port (16443) to be the problem, but adding the CA cert file did not resolve the issue.

Spark 3.1.1; MicroK8s v1.3.7; Ubuntu 18.04 LTS

Upvotes: 0

Views: 1594

Answers (2)

Joseph Hui
Joseph Hui

Reputation: 587

I am struggling to build my Apache Spark cluster with MicroK8s and encountered the same error when spark-submitting the spark-pi example. The problem seems to be the CA cert of the MicroK8s cluster is not trusted by the JVM that executes spark-submit. I tried including the spark.kubernetes.authenticate.caCertFile parameter that points to the CA cert copied from the cluster but it didn't work. Finally, I imported the CA cert of the cluster into the JVM and solved the problem.

Upvotes: 0

Hussein Awala
Hussein Awala

Reputation: 5096

I had the same problem with OVH managed Kubernetes service, where the driver couldn’t get the certificate from the Kubernetes API, and it generated an error with the certificate, and when I added the certificate path as an argument to spark-sumbit, I git the same error as yours.
This problem was solved by providing the full chain of my certificate (pem file) and not only the certificate file (crt), where the driver considered the certificate as self signed and not trusted.
You can create a Kubernetes secret from the full certificate chain file, then mount it to all your applications pods, and finally add the mounting path as an argument to the spark-submit command.

Upvotes: 1

Related Questions