Karthi

Reputation: 21

Convert a date string column with a timezone offset to a timestamp in Spark Scala

I need to convert a string column (a date with a timezone offset) to a Timestamp. The converted timestamp column should have the same value as the string field.

The string field holds a date and time with an offset. When I cast it to the timestamp data type, the value is converted to UTC, but I want to keep the same date and time, with the offset, as a timestamp data type.

Seq("2019-02-05T18:59:11.0874121+05:30").toDF("date_str")
.select($"date_str")
.withColumn("date_timestamp",$"date_str".cast("timestamp"))
.show(false)

I expect the date_timestamp column to contain "2019-02-05T18:59:11.0874121+05:30", but the value is actually converted to UTC: "2019-02-05T13:29:11.087+0000".

Upvotes: 1

Views: 1857

Answers (1)

Zack

Reputation: 2466

I use a UDF to convert strings to timestamps without any changes.

import java.text.SimpleDateFormat
import java.sql.Timestamp
import org.apache.spark.sql.functions.udf

val convertToTimestamp= (logTimestamp: String) => {
  try {
    // change the pattern to match your input; this one expects e.g. "2019-02-05 18:59:11,087"
    val sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss','SSS")
    val theDate = sdf.parse(logTimestamp)
    new Timestamp(theDate.getTime)
  } catch {
    case _: Exception => null
  }
}

// register for SQL
sqlContext.udf.register("convertToTimestamp", convertToTimestamp)
// register for the DataFrame API
val convertToTimestampUDF = udf(convertToTimestamp)

// read the string column (date_str in the question) and expose the result as date_timestamp
val newDfWithTimeStamp = oldDfWithString.select(convertToTimestampUDF($"date_str").alias("date_timestamp"))
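
For the exact string in the question, a rough alternative sketch uses java.time instead of SimpleDateFormat. It is not part of the code above, and the names KeepWallClock and toWallClockTimestamp are made up for illustration. A timestamp value cannot carry the +05:30 offset itself, so this sketch keeps the date and time fields as written and drops the offset instead of shifting to UTC. It assumes the input is an ISO-8601 string with an offset.

```scala
import java.sql.Timestamp
import java.time.OffsetDateTime
import scala.util.Try

// Hypothetical helper, for illustration only.
object KeepWallClock {
  // Parses an ISO-8601 date-time with an offset and returns a Timestamp
  // holding the same local date and time fields; the offset is discarded.
  // Returns null for null or unparseable input, like the UDF above.
  def toWallClockTimestamp(s: String): Timestamp =
    Try {
      // OffsetDateTime.parse uses ISO_OFFSET_DATE_TIME (up to 9 fractional digits)
      val parsed = OffsetDateTime.parse(s)
      Timestamp.valueOf(parsed.toLocalDateTime)
    }.getOrElse(null)
}
```

It could be wrapped the same way as above, e.g. udf(KeepWallClock.toWallClockTimestamp _). Spark keeps only microsecond precision, and what show displays still depends on the session time zone, so treat this as a starting point rather than a guaranteed fix.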

Upvotes: 1
