Fanooos

Reputation: 2818

Apache Spark job failed with FileNotFoundException

I have a Spark cluster consisting of 5 nodes, and I have a Spark job written in Java that reads a set of files from a directory and sends their content to Kafka.
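
My job is conceptually similar to the simplified sketch below (this is not my actual FileSparkMigrator code; the class name, broker addresses, and topic here are just placeholders):

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    // Hypothetical simplified sketch of the job; not the real FileSparkMigrator.
    public class FileToKafkaSketch {
        public static void main(String[] args) {
            String inputDir = args[0]; // e.g. /home/me/shared
            String brokers = args[1];  // e.g. kafka01:9092,kafka02:9092,...
            String topic = args[2];    // e.g. kafka_topic

            SparkConf conf = new SparkConf().setAppName("FileToKafkaSketch");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Read every line of every file in the input directory.
            JavaRDD<String> lines = sc.textFile(inputDir);

            // One Kafka producer per partition; each line becomes a record.
            lines.foreachPartition(partition -> {
                Properties props = new Properties();
                props.put("bootstrap.servers", brokers);
                props.put("key.serializer",
                        "org.apache.kafka.common.serialization.StringSerializer");
                props.put("value.serializer",
                        "org.apache.kafka.common.serialization.StringSerializer");
                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    while (partition.hasNext()) {
                        producer.send(new ProducerRecord<>(topic, partition.next()));
                    }
                }
            });

            sc.stop();
        }
    }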

When I was testing the job locally, everything was working fine.

When I tried to submit the job to the cluster, it failed with a FileNotFoundException.

The files to be processed exist in a directory mounted on all 5 nodes, so I am sure the file path that appears in the exception exists.

Here is the exception thrown while submitting the job:

java.io.FileNotFoundException: File file:/home/me/shared/input_1.txt does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The directory /home/me/shared/ is mounted on all 5 nodes.

EDIT:

Here is the command I am using to submit the job:

bin$ ./spark-submit --total-executor-cores 20 --executor-memory 5G --class org.company.java.FileMigration.FileSparkMigrator --master spark://spark-master:7077 /home/me/FileMigrator-0.1.1-jar-with-dependencies.jar /home/me/shared kafka01,kafka02,kafka03,kafka04,kafka05 kafka_topic

I ran into some weird behavior. When I submitted the job while the directory contained only one file, the exception was thrown on the driver but the file was processed successfully. Then I added another file, and the same thing happened. But once I added a third file, the exception was thrown and the job failed.

EDIT 2

After some attempts, we discovered that a problem with the mounted directory was causing this weird behavior.

Upvotes: 3

Views: 1250

Answers (2)

Fanooos

Reputation: 2818

Here is what solved the problem for me. It is weird, and I have no idea what the actual problem was.

I simply asked the sysadmin to mount another directory instead of the one I was using. After that, everything worked fine.

It seems there was an issue with the old mounted directory, but I have no idea what the actual problem was.

Upvotes: 0

Tim

Reputation: 3725

Spark defaults to HDFS. This looks like an NFS file, so try accessing it with: file:///home/me/shared/input_1.txt

Yes, three /!
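
For example, assuming a JavaSparkContext named sc as in the question, something like this forces the local/NFS filesystem (the path is just illustrative):

    // Without a scheme, the path is resolved against the cluster's default
    // filesystem (often HDFS); file:// plus the absolute path gives three slashes.
    JavaRDD<String> lines = sc.textFile("file:///home/me/shared/input_1.txt");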

Upvotes: 1
