Himanshu

Reputation: 148

unix_timestamp() function changes hour in scala spark

I am using Spark 2.1.0 on Unix and found a weird issue where unix_timestamp changes the hour for one particular timestamp. I created a DataFrame as below.

The first record in df2 has "20170312020200" as a String, which I later cast to a timestamp in df3. The hour should be 02, but it comes out as 03 in df3. The second record converts from string to timestamp without any issue.

This doesn't happen when I run the app from IntelliJ on my local system. It also happens when we run our app with spark-submit.

Upvotes: 0

Views: 10911

Answers (2)

vaquar khan

Reputation: 11469

I am using Spark 2 and get the results below. Your issue is not related to unix_timestamp or the Spark version; please check your data.

import org.apache.spark.sql.functions.unix_timestamp

val df2 = sc.parallelize(Seq(
      (10, "date", "20170312020200"), (10, "date", "20170312050200"))
    ).toDF("id ", "somthing ", "datee")

df2.show()

val df3=df2.withColumn("datee", unix_timestamp($"datee", "yyyyMMddHHmmss").cast("timestamp"))


df3.show()  



+---+---------+--------------+
|id |somthing |         datee|
+---+---------+--------------+
| 10|     date|20170312020200|
| 10|     date|20170312050200|
+---+---------+--------------+

+---+---------+-------------------+
|id |somthing |              datee|
+---+---------+-------------------+
| 10|     date|2017-03-12 02:02:00|
| 10|     date|2017-03-12 05:02:00|
+---+---------+-------------------+

import org.apache.spark.sql.functions.unix_timestamp
df2: org.apache.spark.sql.DataFrame = [id : int, somthing : string ... 1 more field]
df3: org.apache.spark.sql.DataFrame = [id : int, somthing : string ... 1 more field]

Upvotes: 1

Joe K

Reputation: 18434

March 12, 2017 2:02 AM is not a valid time in a lot of time zones. That was when daylight saving time kicked in and the clock skipped from 1:59:59 to 3:00:00 in the US.
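To see the skipped hour without Spark, here is a minimal sketch using plain java.time from Scala. The object name DstGapDemo, the helper resolve, and the choice of America/New_York as an example US zone are my own assumptions for illustration. Spark's internal parsing may follow a different code path, but it shows the same forward shift for a wall-clock time that falls in the gap.

```scala
import java.time.{LocalDateTime, ZoneId, ZonedDateTime}
import java.time.format.DateTimeFormatter

// Hypothetical demo object, not part of Spark.
object DstGapDemo {
  private val fmt = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  // Interpret the wall-clock string in the given zone. For a time inside
  // a DST gap, java.time moves it later by the length of the gap.
  def resolve(raw: String, zone: String): ZonedDateTime =
    LocalDateTime.parse(raw, fmt).atZone(ZoneId.of(zone))

  def main(args: Array[String]): Unit = {
    // 02:02 does not exist on this date in New York; the hour becomes 3.
    println(resolve("20170312020200", "America/New_York").getHour)
    // UTC has no DST, so the hour stays 2.
    println(resolve("20170312020200", "UTC").getHour)
    // 05:02 is after the gap and is unaffected in either zone.
    println(resolve("20170312050200", "America/New_York").getHour)
  }
}
```

This matches what the question reports: only the 02:02 record moves, and only when the zone in effect observes US daylight saving time.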

My guess is your local machine and the spark cluster have different system time zone settings.
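One way to test that guess is to print the JVM default time zone in both environments and compare. The sketch below uses only java.util.TimeZone; the object name ZoneCheck and its helpers are hypothetical names for illustration.

```scala
import java.util.TimeZone

// Hypothetical helper object for comparing environments.
object ZoneCheck {
  // ID of the time zone this JVM uses by default (e.g. from the OS or -Duser.timezone).
  def defaultZoneId: String = TimeZone.getDefault.getID

  // Whether the given zone observes daylight saving time under its current rules.
  def observesDst(zoneId: String): Boolean =
    TimeZone.getTimeZone(zoneId).useDaylightTime

  def main(args: Array[String]): Unit = {
    val id = defaultZoneId
    println(s"JVM default zone: $id, observes DST: ${observesDst(id)}")
  }
}
```

If the two environments print different zones, that would explain the difference. Assuming you want the same behaviour everywhere, one option is to pin the JVM zone with -Duser.timezone=UTC, passed through spark.driver.extraJavaOptions and spark.executor.extraJavaOptions on spark-submit.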

Upvotes: 4
