Rahul

Reputation: 2374

How to handle dates in Spark using Scala?

I have a flat file that looks like this:

id,name,desg,tdate
1,Alex,Business Manager,2016-01-01

I am using the SparkContext to read this file as follows.

val myFile = sc.textFile("file.txt")

I want to generate a Spark DataFrame from this file and I am using the following code to do so.

case class Record(id: Int, name: String, desg: String, tdate: String)

val myFile1 = myFile.map(x => x.split(",")).map {
  case Array(id, name, desg, tdate) => Record(id.toInt, name, desg, tdate)
}

myFile1.toDF()

This is giving me a DataFrame with id as int and rest of the columns as String.

I want the last column, tdate, to be cast to a date type.

How can I do that?

Upvotes: 5

Views: 4772

Answers (1)

mgaido

Reputation: 3055

You just need to convert the String to a java.sql.Date object. Then, your code can simply become:

import java.sql.Date

case class Record(id: Int, name: String, desg: String, tdate: Date)

val myFile1 = myFile.map(x => x.split(",")).map {
  case Array(id, name, desg, tdate) => Record(id.toInt, name, desg, Date.valueOf(tdate))
}

myFile1.toDF()
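
Alternatively, if you would rather keep tdate as a String in the case class and do the conversion at the DataFrame level, something like this should also work (a sketch assuming Spark 2.x; to_date and col come from org.apache.spark.sql.functions, and it relies on the same implicits that make your toDF() call work):

import org.apache.spark.sql.functions.{col, to_date}

// Start from your original myFile1, where tdate is still a String,
// then parse the column (default format yyyy-MM-dd) into DateType.
val df = myFile1.toDF()
  .withColumn("tdate", to_date(col("tdate")))

One caveat with the Date.valueOf approach above: it expects the yyyy-[m]m-[d]d format and throws an IllegalArgumentException on anything else, so wrap it in scala.util.Try if your input may contain malformed dates.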

Upvotes: 8
