user7079832

PySpark job fails with "No space left on device"

I am on a standalone cluster with a master and 3 worker nodes. When running a large job, I hit a "No space left on device" error.

I tried following Why does a job fail with "No space left on device", but df says otherwise? and set this variable in the master's spark-defaults.conf:

spark.local.dir            SOME/DIR/WHERE/YOU/HAVE/SPACE

Then I restarted the cluster. But even after that change, Spark is still using /tmp for the temporary shuffle store (I watched disk usage with df -h while the job was running) instead of the directory I set in spark-defaults.conf, even though that directory shows up in the web UI's Environment tab.
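For reference, this is roughly what I did; the path is a placeholder for a mount with free space:

```shell
# Append the setting to spark-defaults.conf on the master
# (/data/spark-tmp is a placeholder; use any mount with free space)
echo "spark.local.dir    /data/spark-tmp" >> "$SPARK_HOME/conf/spark-defaults.conf"

# Restart the standalone cluster so the change is picked up
"$SPARK_HOME/sbin/stop-all.sh"
"$SPARK_HOME/sbin/start-all.sh"
```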

Why is /tmp still being used? Do I need to set anything else, anywhere else?

I also followed Spark: java.io.IOException: No space left on device, which says I need to set the property below in spark-env.sh:

SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark -Dhadoop.tmp.dir=/mnt/ephemeral-hdfs"

export SPARK_JAVA_OPTS

What do the paths "/mnt/spark" and "/mnt/ephemeral-hdfs" denote? And do I need to set this only in the master's spark-env.sh, or on every worker node as well?

Please help. Thanks.

Upvotes: 3

Views: 2060

Answers (1)

user7079832

OK, I found the solution. I think the "spark.local.dir" setting was being overridden by Spark's default, i.e. the /tmp path.

But setting the two variables below in "spark-env.sh" on the master and on each worker worked:

export SPARK_WORKER_DIR=dir_where_you_have_enough_space
export SPARK_LOCAL_DIRS=dir_where_you_have_enough_space
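Put together, spark-env.sh looks like this on the master and on every worker (the paths are placeholders; pick mounts with enough free space), followed by a cluster restart so the workers pick up the new environment:

```shell
# conf/spark-env.sh -- identical on the master and all workers
# (placeholder paths; choose mounts with enough free space)
export SPARK_WORKER_DIR=/data/spark-work    # worker application dirs and logs
export SPARK_LOCAL_DIRS=/data/spark-local   # scratch space for shuffle spills

# Restart the standalone cluster so the environment takes effect
"$SPARK_HOME/sbin/stop-all.sh"
"$SPARK_HOME/sbin/start-all.sh"
```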

Hope it helps somebody someday. :)

Upvotes: 7
