HHH

Reputation: 6475

How to read from/write to HDFS from the driver in Spark

I'd like to know whether it is possible to access HDFS from the driver in a Spark application, that is, how to read or write an HDFS file directly in the driver program. One possible solution is to read the file as an RDD (sc.textFile) and then collect it in the driver. However, this is not what I'm looking for.

Upvotes: 1

Views: 1988

Answers (2)

mgaido

Reputation: 3055

If you want to access HDFS directly from the driver, you can simply do (in Scala):

import org.apache.hadoop.fs.FileSystem
val hdfs = FileSystem.get(sc.hadoopConfiguration)

You can then use the resulting hdfs variable to access HDFS directly as a file system, without going through Spark.

(The code snippet assumes you have a properly configured SparkContext called sc.)
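As a rough sketch of what using such a FileSystem handle might look like, the helpers below write a string to a path and read it back using the Hadoop FileSystem API. The object name DriverHdfsIo, the helper names, and the choice of UTF-8 are assumptions made for illustration, not part of the original answer.

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.fs.{FileSystem, Path}
import scala.io.Source

// Hypothetical helper object; the names are illustrative only.
object DriverHdfsIo {

  // Write `text` to `path` as UTF-8, overwriting any existing file.
  def writeText(fs: FileSystem, path: Path, text: String): Unit = {
    val out = fs.create(path, true) // second argument: overwrite
    try out.write(text.getBytes(StandardCharsets.UTF_8))
    finally out.close()
  }

  // Read the whole file at `path` into a String, decoding it as UTF-8.
  // Only suitable for files that fit comfortably in driver memory.
  def readText(fs: FileSystem, path: Path): String = {
    val in = fs.open(path)
    try Source.fromInputStream(in, "UTF-8").mkString
    finally in.close()
  }
}
```

In the driver this could be called as DriverHdfsIo.writeText(hdfs, new Path("/tmp/example.txt"), "hello") followed by DriverHdfsIo.readText(hdfs, new Path("/tmp/example.txt")), where /tmp/example.txt is a made-up path and hdfs is the handle obtained from sc.hadoopConfiguration.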

Upvotes: 4

Sandeep Purohit

Reputation: 3692

Simply collect all the data at the driver with the collect action, then use the HDFS Java API to write it to HDFS.
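A minimal sketch of this collect-then-write approach, assuming an RDD[String] whose contents fit in driver memory. The object and method names (CollectAndWrite, writeLines, collectToHdfs) are hypothetical, and the newline-per-element output format is an assumption.

```scala
import java.io.{BufferedWriter, OutputStreamWriter}
import java.nio.charset.StandardCharsets
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.rdd.RDD

// Hypothetical helper object; the names are illustrative only.
object CollectAndWrite {

  // Write each element of `lines` followed by "\n" to `path`, overwriting it.
  def writeLines(fs: FileSystem, path: Path, lines: Seq[String]): Unit = {
    val writer = new BufferedWriter(
      new OutputStreamWriter(fs.create(path, true), StandardCharsets.UTF_8))
    try lines.foreach { line => writer.write(line); writer.write("\n") }
    finally writer.close()
  }

  // Pull the whole RDD into the driver, then write it with the FileSystem API.
  // collect() materializes every element in driver memory.
  def collectToHdfs(rdd: RDD[String], path: Path): Unit = {
    val fs = FileSystem.get(rdd.sparkContext.hadoopConfiguration)
    writeLines(fs, path, rdd.collect().toSeq)
  }
}
```

Because collect() brings the entire dataset to the driver, this is only reasonable for small results.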

Upvotes: -3
