Reputation: 1789
I have at my disposal ~100 CPU nodes, each with 192 cores and 1.5TB RAM. I am running some large Spark jobs (each on 40 instances), but I'm not sure of the best way to tune the Spark parameters. This is what I typically use, taken from some Spark tutorials:
--conf spark.driver.cores=16 \
--conf spark.driver.memory="256G" \
--conf spark.driver.maxResultSize="16g" \
--conf spark.sql.shuffle.partitions=4000 \
--conf spark.kubernetes.executor.limit.cores=32 \
--conf spark.kubernetes.driver.limit.cores=32 \
--conf spark.executor.cores=15 \
--conf spark.executor.memory="256G" \
The Spark jobs run successfully, but they take a very long time, and I feel the parameters don't reflect the hardware I have. Any suggestions?
If I understand correctly, this thread How to tune spark executor number, cores and executor memory? suggests setting only 5 cores per executor, which would mean I could run 38 executors per node? What I'm also struggling to understand is the relation between driver and executor cores (and also Kubernetes cores).
Upvotes: 0
Views: 81
Reputation: 3505
Your current Spark configuration looks like a good starting point. There is no definitive answer, since the best settings vary by workload, but here are some suggestions you can try and test when tuning your Spark parameters:
Optimizing Your Configuration:
Driver Cores: Consider reducing spark.driver.cores to 8-16 cores.
Executor Cores: Given your powerful nodes with 192 cores, you can likely increase spark.executor.cores. Experiment with values between 5 and 15 to find the optimal setting for your workload.
Executor Memory: While you have ample RAM, adjust spark.executor.memory based on the memory footprint of your tasks. As a starting point, try keeping the total executor memory per node around 60-70% of the node's RAM (roughly 1TB) to leave space for the operating system and other processes. Monitor Spark UI memory metrics and adjust accordingly.
Shuffle Partitions: spark.sql.shuffle.partitions controls how many partitions are used when data is shuffled for wide operations such as joins and aggregations. With 100 nodes, 4000 partitions might be a bit low. You can increase it gradually, but be mindful of the added overhead. Use Spark UI shuffle metrics to monitor performance and adjust if needed. One way these suggestions could be translated into submit flags is sketched right after this list.
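For illustration only, a set of flags following the suggestions above might look like the snippet below. The concrete numbers (5 cores and 20g heap + 4g overhead per executor, an 8-core/64g driver) are assumptions to validate against your own workload in the Spark UI, not recommendations:
# illustrative values only -- check the Spark UI and adjust for your workload
--conf spark.driver.cores=8 \
--conf spark.driver.memory="64g" \
--conf spark.driver.maxResultSize="16g" \
--conf spark.executor.cores=5 \
--conf spark.executor.memory="20g" \
--conf spark.executor.memoryOverhead="4g" \
--conf spark.kubernetes.executor.limit.cores=5 \
--conf spark.kubernetes.driver.limit.cores=8 \
--conf spark.sql.shuffle.partitions=4000 \
With 5 cores per executor this allows roughly 38 executor pods per node (192 / 5), and 38 x (20g + 4g) = 912g, which stays within the ~1TB per-node budget mentioned above.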
Understanding Driver and Executor Cores:
Driver Cores: The driver controls the overall Spark application. While some cores are needed, setting it too high can steal resources from executors. A value of 8-16 cores is often sufficient.
Executor Cores: Executors are responsible for running tasks that process your data. More cores per executor can improve processing speed, but ensure you don't overcommit resources. Aim for 5-15 cores per executor based on your workload.
Kubernetes Cores: If you're using Kubernetes for resource management, the spark.kubernetes.executor.limit.cores and spark.kubernetes.driver.limit.cores settings specify the maximum CPU an executor or driver pod may use. These should be aligned with your executor and driver core allocations, respectively; a small sketch follows.
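As a minimal sketch of that alignment, reusing the illustrative 5-core executor from above (spark.kubernetes.executor.request.cores is optional and shown only to make the request vs. limit distinction explicit):
# keep the pod's CPU limit in line with the cores Spark will actually schedule on
--conf spark.executor.cores=5 \
--conf spark.kubernetes.executor.request.cores=5 \
--conf spark.kubernetes.executor.limit.cores=5 \
If the limit is set lower than spark.executor.cores, Spark will still schedule that many concurrent tasks per executor, but the container runtime will throttle the pod's CPU.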
Additional Tips:
Monitoring and Tuning: As mentioned above, use the Spark UI to track resource utilization (CPU, memory, network), task completion times, and shuffle statistics. These will help you identify bottlenecks and further refine your configuration. A small sketch for keeping that UI data after a job finishes is included below.
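If you also want that UI data to survive after an application finishes (so runs can be compared in the History Server), event logging can be enabled; the directory below is a placeholder, not a real path:
# illustrative; point spark.eventLog.dir at storage your cluster can actually reach
--conf spark.eventLog.enabled=true \
--conf spark.eventLog.dir="s3a://your-bucket/spark-events" \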
Upvotes: 0