Peter

Reputation: 1082

Access files in resources directory in JAR from Apache Spark Streaming context

I have written a Java application as a Spark Streaming job. It requires some text resources, which I have included in the JAR in a resources directory (using the default Maven directory structure). In unit tests I have no problem accessing these files, but when I run the program with spark-submit I get a FileNotFoundException. How do I access files on the classpath in my JAR when running with spark-submit?

The code I am currently using to access my file looks roughly like this:

    InputStream input;

    try {
        URL url = this.getClass().getClassLoader().getResource("my file");
        if (url == null) {
            throw new IOException("file does not exist");
        }
        String path = url.getPath();
        input = new FileInputStream(path);
    } catch(IOException e) {
        throw new RuntimeException(e);
    }

Thanks.

Note: this is not a duplicate of "Reading a resource file from within jar" (which was suggested), because this code works when run locally. It only fails when run on a Spark cluster.

Upvotes: 6

Views: 4388

Answers (1)

Peter

Reputation: 1082

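I fixed this by accessing the resources directory a different (and significantly less silly) way. Once the application is packaged, the resource lives inside the JAR, so the path taken from its URL does not name a real file on disk and FileInputStream cannot open it. Reading the resource as a stream avoids the file path entirely: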

    input = MyClass.class.getResourceAsStream("/my file");
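
For completeness, here is a minimal sketch of the same idea with a null check and stream cleanup. The class name `ResourceReader`, the method `readResource`, and the choice of UTF-8 are my own assumptions for illustration, not part of the original code; only `getResourceAsStream` comes from the fix above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the class and method names are illustrative only.
final class ResourceReader {

    private ResourceReader() {}

    /**
     * Reads a classpath resource fully into a string.
     * Assumes the resource is UTF-8 text; adjust the charset if it is not.
     */
    static String readResource(Class<?> anchor, String name) throws IOException {
        // getResourceAsStream returns null (it does not throw) when the resource is missing.
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("resource not found on classpath: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            // Copy the stream in chunks; this works whether the resource is a
            // plain file on disk or an entry inside a JAR.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

With this in place, something like `ResourceReader.readResource(MyClass.class, "/my file")` would return the file contents. The leading slash makes the name absolute from the classpath root; without it, the name is resolved relative to the package of the class passed in.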

Upvotes: 3
