Reputation: 18945
I have a trivially small Spark application written in Java that I am trying to run in a K8s cluster using spark-submit. I built an image with the Spark binaries, my uber-JAR with all necessary dependencies (in /opt/spark/jars/my.jar), and a config file (in /opt/spark/conf/some.json).
In my code, I start with
SparkSession session = SparkSession.builder()
.appName("myapp")
.config("spark.logConf", "true")
.getOrCreate();
Path someFilePath = FileSystems.getDefault().getPath("/opt/spark/conf/some.json");
String someString = new String(Files.readAllBytes(someFilePath));
and get this exception at readAllBytes from the Spark driver:
java.nio.file.NoSuchFileException: /opt/spark/conf/some.json
If I run my Docker image manually, I can definitely see the file /opt/spark/conf/some.json as I expect. My Spark job runs as root, so file permissions should not be a problem.
I have been assuming that, since the same Docker image, with the file indeed present, will be used to start the driver (and executors, but I don't even get to that point), the file should be available to my application. Is that not so? Why wouldn't it see the file?
Upvotes: 0
Views: 1727
Reputation: 1800
You seem to get this exception from one of your worker nodes, not from the container.
Make sure that you've specified all the files you need with the --files option of spark-submit:
spark-submit --master yarn --deploy-mode cluster --files <local file dependencies> ...
https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
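Files distributed with --files are placed in the working directory of each executor, and per the linked docs they can be located at runtime with SparkFiles.get instead of a hard-coded container path. A minimal sketch of that pattern in Java, assuming the config was submitted as --files some.json (the class name and log line are only illustrative):
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.spark.SparkFiles;
import org.apache.spark.sql.SparkSession;

public class ReadDistributedConfig {
    public static void main(String[] args) throws Exception {
        SparkSession session = SparkSession.builder()
                .appName("myapp")
                .config("spark.logConf", "true")
                .getOrCreate();

        // some.json was passed to spark-submit via --files;
        // SparkFiles.get resolves the local path of the downloaded copy
        // on whichever node runs this code.
        String localPath = SparkFiles.get("some.json");
        String someString = new String(Files.readAllBytes(Paths.get(localPath)));

        System.out.println("Loaded config, length = " + someString.length());
        session.stop();
    }
}
This way the code does not depend on an absolute path baked into the image; Spark resolves the location of the shipped file itself.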
Upvotes: 1