Tomasz Sosiński

Reputation: 869

Flink cluster configuration issue - no slots available

I have deployed a Flink cluster with the following parallelism-related configuration:

jobmanager.heap.mb: 2048
taskmanager.heap.mb: 2048
taskmanager.numberOfTaskSlots: 5
parallelism.default: 2

But if I try to run any example or JAR, even with the -p flag, I receive the following error:

org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. 
Task to schedule: < Attempt #1 (Source: Custom Source -> Sink: Unnamed (1/1)) @ (unassigned) - [SCHEDULED] > with groupID < 22f48c24254702e4d1674069e455c81a > in sharing group < SlotSharingGroup [22f48c24254702e4d1674069e455c81a] >. Resources available to scheduler: 
Number of instances=0, total number of slots=0, available slots=0
        at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:255)
        at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
        at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:303)
        at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:453)
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:326)
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:742)
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:889)
        at org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy$1.call(FixedDelayRestartStrategy.java:80)
        at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Which should not come as a surprise, as the dashboard shows:

[Screenshot: the dashboard reports 0 task managers and 0 available task slots]

I have tried restarting the cluster several times, but it does not seem to pick up the configuration.

Upvotes: 11

Views: 13343

Answers (3)

Rajeev Rathor

Reputation: 1932

I finally found the solution to this Flink issue in my case. First I will explain the root cause, then the solution.

Root Cause: Could not create the Java Virtual Machine.

Check the Flink logs by tailing the task-executor out-file:

tail -500f flink-root-taskexecutor-3-osboxes.out

I found the following entries:

Invalid maximum direct memory size: -XX:MaxDirectMemorySize=8388607T
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
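A quick way to check for this failure mode is to grep the task-executor out-files for JVM startup errors. A minimal sketch; the sample lines below reproduce the log entries from this answer so the check can be demonstrated without a live cluster, and the file name follows this answer's host, so substitute your own flink-*-taskexecutor-*.out files:

```shell
# Write a sample out-file containing the JVM startup failure from this answer
out=flink-root-taskexecutor-3-osboxes.out
cat > "$out" <<'EOF'
Invalid maximum direct memory size: -XX:MaxDirectMemorySize=8388607T
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
EOF

# Count occurrences of the fatal JVM error; any match means the
# TaskManager process died before it could register with the JobManager,
# which is why the dashboard shows 0 slots.
matches=$(grep -c "Could not create the Java Virtual Machine" "$out")
if [ "$matches" -gt 0 ]; then
  echo "TaskManager JVM failed to start"
fi
rm -f "$out"
```

If the grep matches, the scheduler's "Number of instances=0" in the original exception is explained: no TaskManager ever came up.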

This happens because the installed Java version does not match the OS: the OS is 64-bit, but I had installed a 32-bit JDK, which cannot handle a direct-memory size that large.

Solution:

  1. Install the correct 64-bit JDK 1.8 (after installation, the error in the task-executor log disappeared).

  2. Edit the flink-conf.yaml file and update:

     taskmanager.numberOfTaskSlots: 10
     parallelism.default: 1

My problem was resolved and the Flink cluster now runs perfectly, both locally and in the cloud.

[Screenshot: dashboard of the running cluster]

Upvotes: 0

GoForce5500

Reputation: 131

I ran into the same issue, and I remembered hitting a similar problem with Spark: I had newly installed JDK 11 for testing, which changed my JAVA_HOME environment variable to /Library/Java/JavaVirtualMachines/jdk-11.jdk/Contents/Home.

So I set JAVA_HOME back to JDK 8 with:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home

and everything ran smoothly. That path is for my Mac; find your own JAVA_HOME on your system. Hope that helps.
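The fix above can be sketched as a small shell snippet to run before starting the cluster. The JDK path is the one from this answer's Mac and is only an example; substitute the location of your own JDK 8 install:

```shell
# Pin JAVA_HOME to a JDK 8 install so Flink's launch scripts pick it up
# (example path from this answer; adjust for your system)
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home

# Put that JDK's binaries first on PATH so `java` resolves to JDK 8
export PATH="$JAVA_HOME/bin:$PATH"

echo "$JAVA_HOME"
```

Putting these exports in your shell profile makes the choice persistent across sessions, rather than reverting to JDK 11 on the next login.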

Upvotes: 4

Mudit bhaintwal

Reputation: 555

The exception simply means there is no task manager, and hence no slots available to run the job. A task manager can go down for many reasons, e.g. a runtime exception or a misconfiguration; check the logs for the exact cause. Restart the cluster, and once task managers appear in the dashboard, run the job again. You can also define a proper restart strategy in the configuration, such as fixed-delay restart, so that the job retries automatically after a genuine transient failure.
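A fixed-delay restart strategy can be set in flink-conf.yaml; a sketch with illustrative values (tune attempts and delay for your workload):

```yaml
# flink-conf.yaml — retry a failed job up to 3 times, 10 s apart
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```

Note that a restart strategy only helps once task managers are back; it will not conjure slots while the dashboard still shows zero instances.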

Upvotes: 0
