Reputation: 5451
Is it possible to read a file over SFTP in Spark?
I tried val df = sc.textFile("sftp://user:password@host/home/user/sample.csv")
but I get the error below when I run an action on it:
scala> df.count
java.io.IOException: No FileSystem for scheme: sftp
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
Is there any way to read a file over SFTP in Spark?
Upvotes: 2
Views: 6592
Reputation: 5451
We've created a very simple Spark SFTP connector to do that.
It is on GitHub: https://github.com/springml/spark-sftp
It has also been published to spark-packages: http://spark-packages.org/package/springml/spark-sftp
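For illustration, reading the CSV from the question through this connector might look roughly like the sketch below. It assumes the spark-sftp package is already on the classpath (for example via --packages, using the coordinates listed on the spark-packages page). The helper name readCsvOverSftp and its parameters are hypothetical, and the option names follow the project's README as I remember it, so check them against the version you install.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper: loads a remote CSV file over SFTP into a DataFrame.
// Assumes the springml spark-sftp data source is on the classpath.
def readCsvOverSftp(sqlContext: SQLContext,
                    host: String,
                    user: String,
                    password: String,
                    remotePath: String): DataFrame =
  sqlContext.read
    .format("com.springml.spark.sftp") // data source name used by the connector
    .option("host", host)
    .option("username", user)
    .option("password", password)
    .option("fileType", "csv")         // format of the remote file
    .option("inferSchema", "true")     // let the CSV reader guess column types
    .load(remotePath)

// Usage in spark-shell (placeholder credentials taken from the question):
// val df = readCsvOverSftp(sqlContext, "host", "user", "password", "/home/user/sample.csv")
// df.count
```

Unlike sc.textFile, this returns a DataFrame rather than an RDD[String], so df.rdd is needed if the RDD API is required.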
Upvotes: 3
Reputation: 330093
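It does not look possible at the moment (Spark 1.6, whose newest Hadoop build profile is hadoop-2.6). Those Hadoop versions ship no FileSystem implementation for the sftp scheme, hence the exception. SFTP support will be introduced in Hadoop 2.8 (see HADOOP-5732).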
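Once Spark runs against Hadoop 2.8 or later, the original sc.textFile call should be able to work. The sketch below is untested and rests on two assumptions: that the SFTP file system added by HADOOP-5732 is the class org.apache.hadoop.fs.sftp.SFTPFileSystem, and that the sftp scheme has to be mapped to it explicitly through fs.sftp.impl (some builds may not need this). The helper name textFileOverSftp is hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper: reads a text file over SFTP with the Hadoop FileSystem API.
// Assumption: Hadoop 2.8+ jars are on the classpath; older versions still fail
// with "No FileSystem for scheme: sftp".
def textFileOverSftp(sc: SparkContext, uri: String): RDD[String] = {
  // Map the sftp:// scheme to Hadoop's SFTP implementation (assumed class name).
  sc.hadoopConfiguration.set("fs.sftp.impl", "org.apache.hadoop.fs.sftp.SFTPFileSystem")
  sc.textFile(uri)
}

// Usage in spark-shell, with the URI from the question:
// val lines = textFileOverSftp(sc, "sftp://user:password@host/home/user/sample.csv")
// lines.count
```

Note that the read is lazy, so a missing or misconfigured file system only surfaces when an action such as count runs, exactly as in the question.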
Upvotes: 2