Luan Carvalho

Reputation: 320

Why can't the external scheduler be instantiated when running Spark on minikube/Kubernetes?

I'm trying to run Spark on Kubernetes (using minikube with the VirtualBox or Docker driver; I tested both) and I'm now hitting an error that I don't know how to solve.

The error is "SparkException: External scheduler cannot be instantiated". I'm new to the Kubernetes world, so this may well be a newbie mistake, but I haven't been able to resolve it on my own.

Please help me.

The command and the error follow below.

I use this spark-submit command:

spark-submit --master k8s://https://192.168.99.102:8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --executor-memory 1024m \
  --conf spark.kubernetes.container.image=spark:latest \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar

And I got this error in the driver pod:

20/06/23 15:24:56 INFO SparkContext: Submitted application: Spark Pi
20/06/23 15:24:56 INFO SecurityManager: Changing view acls to: 185,luan
20/06/23 15:24:56 INFO SecurityManager: Changing modify acls to: 185,luan
20/06/23 15:24:56 INFO SecurityManager: Changing view acls groups to: 
20/06/23 15:24:56 INFO SecurityManager: Changing modify acls groups to: 
20/06/23 15:24:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(185, luan); groups with view permissions: Set(); users  with modify permissions: Set(185, luan); groups with modify permissions: Set()
20/06/23 15:24:57 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
20/06/23 15:24:57 INFO SparkEnv: Registering MapOutputTracker
20/06/23 15:24:57 INFO SparkEnv: Registering BlockManagerMaster
20/06/23 15:24:57 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/06/23 15:24:57 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/06/23 15:24:57 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
20/06/23 15:24:57 INFO DiskBlockManager: Created local directory at /var/data/spark-4f7b787b-ec75-4ae5-b703-f9f90ef130cb/blockmgr-1ef6d02a-48f6-4bd7-9d7d-fe2518850f5e
20/06/23 15:24:57 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
20/06/23 15:24:57 INFO SparkEnv: Registering OutputCommitCoordinator
20/06/23 15:24:57 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/06/23 15:24:57 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-pi-a8278472e1c83236-driver-svc.default.svc:4040
20/06/23 15:24:57 INFO SparkContext: Added JAR local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar at file:/opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar with timestamp 1592925897650
20/06/23 15:24:57 WARN SparkContext: The jar local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar has been added already. Overwriting of added jars is not supported in the current version.
20/06/23 15:24:57 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
20/06/23 15:24:58 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2934)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:528)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2555)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$1(SparkSession.scala:930)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-pi-a8278472e1c83236-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-a8278472e1c83236-driver" is forbidden: User "system:serviceaccount:default:default" cannot get resource "pods" in API group "" in the namespace "default".
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:505)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:395)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:376)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:845)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:214)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:168)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:59)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:58)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:113)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2928)
    ... 19 more
20/06/23 15:24:58 INFO SparkUI: Stopped Spark web UI at http://spark-pi-a8278472e1c83236-driver-svc.default.svc:4040
20/06/23 15:24:58 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/06/23 15:24:58 INFO MemoryStore: MemoryStore cleared
20/06/23 15:24:58 INFO BlockManager: BlockManager stopped
20/06/23 15:24:58 INFO BlockManagerMaster: BlockManagerMaster stopped
20/06/23 15:24:58 WARN MetricsSystem: Stopping a MetricsSystem that is not running
20/06/23 15:24:58 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/06/23 15:24:58 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2934)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:528)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2555)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$1(SparkSession.scala:930)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-pi-a8278472e1c83236-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-a8278472e1c83236-driver" is forbidden: User "system:serviceaccount:default:default" cannot get resource "pods" in API group "" in the namespace "default".
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:505)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:395)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:376)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:845)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:214)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:168)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:59)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:58)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:113)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2928)
    ... 19 more
20/06/23 15:24:58 INFO ShutdownHookManager: Shutdown hook called
20/06/23 15:24:58 INFO ShutdownHookManager: Deleting directory /var/data/spark-4f7b787b-ec75-4ae5-b703-f9f90ef130cb/spark-616edc5e-b42d-4c77-9f11-8465b4d69642
20/06/23 15:24:58 INFO ShutdownHookManager: Deleting directory /tmp/spark-71e3bd59-3b7d-4d72-a442-b0ad0c7092fb

Thank you!

PS: I'm using Spark 3.0 (the new version) and minikube 1.11.0.

Upvotes: 7

Views: 5910

Answers (1)

Rico

Reputation: 61669

Based on the log file:

Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-a8278472e1c83236-driver" is forbidden: User "system:serviceaccount:default:default" cannot get resource "pods" in API group "" in the namespace "default".

It looks like the default service account in the default namespace (default:default) doesn't have edit permissions on pods. You can run the following to create a ClusterRoleBinding that grants those permissions:

$ kubectl create clusterrolebinding default \
  --clusterrole=edit --serviceaccount=default:default --namespace=default
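If you'd rather not grant edit rights to the default service account, the Spark on Kubernetes documentation takes the approach of creating a dedicated service account for the driver instead. A minimal sketch of that alternative (the names spark and spark-role are just examples):

$ # create a dedicated service account and grant it edit rights (names are illustrative)
$ kubectl create serviceaccount spark --namespace=default
$ kubectl create clusterrolebinding spark-role \
  --clusterrole=edit --serviceaccount=default:spark --namespace=default

Then point the driver at it by adding --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark to your spark-submit command.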

You can take a look at this cheat sheet.

Upvotes: 11
