Reputation: 59
I am trying to read a CSV file stored on the local filesystem of a UNIX machine. When I run the job in cluster mode, it is not able to find the CSV file.
In local mode it can read both HDFS and file:/// paths. However, in cluster mode it can only read HDFS files.
Is there any suitable way to read without copying it into HDFS?
Upvotes: 0
Views: 429
Reputation: 2091
Remember that the executors need to be able to access the file, so you have to look at it from the executor nodes' perspective. Since you mention HDFS, the executor nodes evidently have access to your HDFS cluster.
If you want the Spark cluster to access a local file, consider a shared filesystem such as NFS or SMB. One way or another, though, something will end up copying the data.
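As a minimal sketch, assuming the file is exposed to every executor node at the same mount point (the path /mnt/shared/data.csv below is hypothetical), you could then read it with the file:// scheme:

```scala
import org.apache.spark.sql.SparkSession

object ReadSharedCsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ReadSharedCsv")
      .getOrCreate()

    // file:/// works in cluster mode only if the same path exists on
    // every executor node, e.g. via an NFS/SMB mount.
    // "/mnt/shared/data.csv" is a hypothetical example path.
    val df = spark.read
      .option("header", "true")
      .csv("file:///mnt/shared/data.csv")

    df.show(5)
    spark.stop()
  }
}
```

Alternatively, for small files you can distribute the file with spark-submit --files (or SparkContext.addFile), but that too ends up copying the data to each node.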
I can update my answer if you add more details on your architecture.
Upvotes: 0