shashank

Reputation: 399

Saving and Overwriting a file in Spark Scala

I have a text file where the first column contains a table name and the second column contains a date. The two columns are delimited by a space. The data looks as follows:

employee.txt

organization 4-15-2018
employee 5-15-2018

My requirement is to read the file, update the date column based on business logic, and save/overwrite the file. Below is my code:

import org.apache.spark.{SparkConf, SparkContext}

object Employee {
  def main(args: Array[String]) {

    val conf = new SparkConf().setMaster("local").setAppName("employeedata")
    val sc = new SparkContext(conf)
    val input = sc.textFile("D:\\employee\\employee.txt")
      .map(line => line.split(' '))
      .map(kvPair => (kvPair(0), kvPair(1)))
      .collectAsMap()

    // Do some operations

    // Iterate and update the map as follows
    // (tableName and updatedDate come from the business logic)
    val finalMap = input + (tableName -> updatedDate)

    sc.stop()
  }
}

How do I save finalMap, overwriting the file if it already exists, in the above scenario?

Upvotes: 1

Views: 1375

Answers (1)

Alper t. Turker

Reputation: 35229

My requirement is to read the file and update the date column based on the business logic and save/overwrite the file.

Never do something like this directly. Always:

  • Write data to a temporary storage first.
  • Delete original using standard file system tools.
  • Rename temporary output using standard file system tools.

An attempt to overwrite the data directly will, with high probability, result in partial or complete data loss.
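The steps above can be sketched for the asker's collected map. Since the data has already been collected to the driver as a plain Scala map, ordinary `java.nio` file operations suffice here (no Spark write is needed); the object and method names below are illustrative, not part of the original post. A single `Files.move` with `REPLACE_EXISTING` combines the delete and rename steps:

```scala
import java.nio.file.{Files, Paths, StandardCopyOption}

object SafeOverwrite {
  // Write the updated (tableName -> date) map to a temporary file,
  // then replace the original in one move. This way a crash mid-write
  // never leaves the target file partially written.
  def saveMap(map: Map[String, String], target: String): Unit = {
    val targetPath = Paths.get(target)
    val tmpPath    = Paths.get(target + ".tmp")

    // Step 1: write data to temporary storage first.
    val lines = map.map { case (table, date) => s"$table $date" }
    Files.write(tmpPath, lines.mkString("\n").getBytes("UTF-8"))

    // Steps 2 and 3: REPLACE_EXISTING deletes the original as part of
    // the rename, so the target is only ever replaced by a complete file.
    Files.move(tmpPath, targetPath, StandardCopyOption.REPLACE_EXISTING)
  }
}
```

If the result were still a distributed RDD rather than a collected map, the same pattern applies with `saveAsTextFile` to a temporary directory, followed by delete and rename through the Hadoop `FileSystem` API.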

Upvotes: 2
