user1361815

Reputation: 353

Understanding spark.default.parallelism in local mode

Hi, this is just for my understanding of the spark.default.parallelism parameter.

Given this documentation:

https://spark.apache.org/docs/latest/configuration.html

I see that in local mode this value should be the number of cores on my machine, and I have 4 cores:

nproc
4

But this:

    println("TEST---> " + sparkSession.sparkContext.defaultParallelism)

submitted with this command:

spark-submit \
  --class PartitioningTest \
  --master local \
  --driver-java-options "-Dlog4j.configuration=application.properties" \
  --driver-class-path $JARFILE \
  $JARFILE

prints out

TEST---> 1

As the doc says, I was expecting 4.

Thanks

Upvotes: 0

Views: 1123

Answers (1)

Matteo Cossu

Reputation: 99

--master local runs Spark with a single thread; you should use local[*] to use all your cores.
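For example, the same submit command with local[*] (quoted so the shell cannot glob-expand the brackets) should report 4 on a 4-core machine:

spark-submit \
  --class PartitioningTest \
  --master "local[*]" \
  --driver-java-options "-Dlog4j.configuration=application.properties" \
  --driver-class-path $JARFILE \
  $JARFILE

You can also pass local[N] (e.g. local[4]) to use exactly N worker threads, or set the master in code instead of on the command line. A minimal sketch of the latter, assuming a PartitioningTest entry point like the one in your question:

import org.apache.spark.sql.SparkSession

object PartitioningTest {
  def main(args: Array[String]): Unit = {
    // local[*] starts one worker thread per available core
    val sparkSession = SparkSession.builder()
      .appName("PartitioningTest")
      .master("local[*]")
      .getOrCreate()

    // Prints the default parallelism, e.g. TEST---> 4 on a 4-core machine
    println("TEST---> " + sparkSession.sparkContext.defaultParallelism)

    sparkSession.stop()
  }
}

Note that per the Spark configuration docs, properties set directly in code take priority over flags passed to spark-submit, so a master hard-coded this way would override --master on the command line.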

Upvotes: 2
