vsingal5

Reputation: 304

Copy a file to Hadoop HDFS using Scala?

I'm trying to copy a file from my local machine to HDFS. The script I'm writing currently writes to a local CSV file, and I'm not sure how to do the copy in Scala. How can I move this file to HDFS using Scala?

Edit: here is what I have so far:

    val hiveServer = new HiveJDBC
    val file = new File(TMP_DIR, fileName)
    val firstRow = getFirstRow(tableName, hiveServer)
    val restData = getRestData(tableName, hiveServer)
    withPrintWriter(file) { printWriter =>
      printWriter.write(firstRow)
      printWriter.write("\n")
      printWriter.write(restData)
    }

I now want to store "file" in HDFS.

Upvotes: 3

Views: 4844

Answers (2)

RAMESH VAKATI

Reputation: 1

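Add the following code to your run method. It assumes the method receives the command-line arguments as args, with the local file path first and the HDFS destination directory second.

// Imports needed at the top of the file.
import java.io.{File, FileInputStream, FileNotFoundException, IOException, InputStream}
import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}
import org.apache.hadoop.io.IOUtils

// Inside run(args: Array[String]); getConf() comes from Hadoop's Configured base class.
val conf = getConf()
val hdfs = FileSystem.get(conf)
val localInputFilePath = args(0)
// Take the file name from the local path.
val inputFileName = new File(localInputFilePath).getName

val hdfsDestinationPath = args(1)
// Use Path(parent, child) rather than File.separator, which is "\" on Windows.
val hdfsDestFilePath = new Path(hdfsDestinationPath, inputFileName)

try {
  val inputStream: InputStream = new FileInputStream(localInputFilePath)
  val fsdos: FSDataOutputStream = hdfs.create(hdfsDestFilePath)
  // The final `true` closes both streams once the copy finishes.
  IOUtils.copyBytes(inputStream, fsdos, conf, true)
} catch {
  case fnfe: FileNotFoundException => fnfe.printStackTrace()
  case ioe: IOException            => ioe.printStackTrace()
}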
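
If the file already exists on local disk, Hadoop's `FileSystem.copyFromLocalFile` can do the stream handling for you. The following is a minimal sketch, not a definitive implementation: the object name `CopyToHdfs`, the method name `copyToHdfs`, and its parameters are hypothetical, and it assumes the `Configuration` on the classpath points at your cluster (with no cluster configuration, Hadoop falls back to the local file system).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, named here only for illustration.
object CopyToHdfs {
  /** Copies a local file into destDir on the configured file system and returns the destination path. */
  def copyToHdfs(localFile: String,
                 destDir: String,
                 conf: Configuration = new Configuration()): Path = {
    // FileSystem.get returns a cached, shared instance, so it is not closed here.
    val fs = FileSystem.get(conf)
    val src = new Path(localFile)
    val dst = new Path(destDir, src.getName)
    // delSrc = false keeps the local file; overwrite = true replaces an existing destination.
    fs.copyFromLocalFile(false, true, src, dst)
    dst
  }
}
```

A call such as `CopyToHdfs.copyToHdfs(file.getAbsolutePath, "/your/hdfs/dir")` would then copy the CSV from the question, where `/your/hdfs/dir` is a placeholder for your own destination directory.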

Upvotes: 0

zsxwing

Reputation: 20816

Scala can call the Hadoop API directly, so you can write the data straight to HDFS instead of going through a local file. For example:

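    import java.io.PrintWriter
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val conf = new Configuration()
    val fs = FileSystem.get(conf)
    // fs.create returns an output stream for the new HDFS file.
    val output = fs.create(new Path("/your/path"))
    val writer = new PrintWriter(output)
    try {
        writer.write(firstRow)
        writer.write("\n")
        writer.write(restData)
    } finally {
        // Closing the writer also closes the underlying HDFS stream.
        writer.close()
    }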
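
One caveat with `PrintWriter` is that it swallows `IOException`s (you have to call `checkError()` to notice a failed write) and it uses the platform default charset. A possible variant, sketched under the assumption that you want UTF-8 output and want write failures to surface as exceptions, wraps the HDFS stream in an `OutputStreamWriter` instead. The names `HdfsCsvWriter` and `writeCsv` are hypothetical.

```scala
import java.io.{BufferedWriter, OutputStreamWriter}
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, named here only for illustration.
object HdfsCsvWriter {
  /** Writes header, a newline, and body to dest on the configured file system. */
  def writeCsv(dest: String,
               header: String,
               body: String,
               conf: Configuration = new Configuration()): Unit = {
    val fs = FileSystem.get(conf)
    // overwrite = true replaces an existing file at dest.
    val stream = fs.create(new Path(dest), true)
    val writer = new BufferedWriter(new OutputStreamWriter(stream, StandardCharsets.UTF_8))
    try {
      writer.write(header)
      writer.write("\n")
      writer.write(body)
    } finally {
      // Closing the writer flushes it and closes the underlying stream.
      writer.close()
    }
  }
}
```

With the values from the question this would be called as `HdfsCsvWriter.writeCsv("/your/path", firstRow, restData)`.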

Upvotes: 2
