Reputation: 3
I have a small cluster of 3 nodes with 12 total cores and 44 GB of memory. I am reading a small text file (5 MB) from HDFS and running the k-means algorithm on it. I set the number of executors to 3 and partitioned my text file into three partitions. The application UI shows that only one of the executors is running all the tasks. Here is a screenshot of the application UI: [screenshot of the Executors page] And here is the Jobs UI: [screenshot of the Jobs page] Can somebody help me figure out why all my tasks run in one executor while the others sit idle? Thanks.
Upvotes: 0
Views: 709
Reputation: 1532
Try repartitioning your file into 12 partitions. With only 3 partitions and 4 cores per node, a single node is enough to run all the tasks. Spark roughly assigns one partition to one core, so giving it 12 partitions lets all 12 cores across the cluster work in parallel; see the sketch below.
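A minimal sketch of this, using MLlib's RDD-based KMeans; the HDFS path and the file format (one space-separated point per line) are assumptions, not from the question:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val spark = SparkSession.builder.appName("KMeansRepartition").getOrCreate()

// A 5 MB file fits in a single HDFS block, so Spark may create
// only a few partitions by default. "hdfs:///data/points.txt" is
// a hypothetical path standing in for your input file.
val lines = spark.sparkContext.textFile("hdfs:///data/points.txt")

// Repartition to match the total core count (3 nodes x 4 cores = 12)
// so each core can pick up roughly one task.
val points = lines
  .repartition(12)
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .cache()

println(s"Partitions: ${points.getNumPartitions}") // should print 12

// Train k-means on the repartitioned data (k = 3, 20 iterations);
// tasks should now be spread across all executors.
val model = KMeans.train(points, 3, 20)
```

Alternatively, you can ask for the partition count up front with `sc.textFile(path, 12)`, which avoids the extra shuffle that `repartition` triggers.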
Upvotes: 1