Klun

Reputation: 54

Spark SQL 2.3.1: strange behavior with unix_timestamp function

I have a Spark SQL query like this:

SELECT
 CAST(UNIX_TIMESTAMP('2015-03-29T01:11:23Z', "yyyy-MM-dd'T'HH:mm:ss'Z'") AS TIMESTAMP) as column1,
 CAST(UNIX_TIMESTAMP('2015-03-29T02:11:23Z', "yyyy-MM-dd'T'HH:mm:ss'Z'") AS TIMESTAMP) as column2
FROM ...

The first column returns the correct result. However, the second column is always NULL.

What am I missing?

UPDATE:

The full code (I have removed the FROM clause to simplify):

val spark = SparkSession.builder.appName("My Application").getOrCreate()
spark.sql("SELECT CAST(UNIX_TIMESTAMP('2015-03-29T01:11:23Z', \"yyyy-MM-dd'T'HH:mm:ss'Z'\") AS TIMESTAMP) as column1, CAST(UNIX_TIMESTAMP('2015-03-29T02:11:23Z', \"yyyy-MM-dd'T'HH:mm:ss'Z'\") AS TIMESTAMP) as column2").show(false)

The full trace:

2018-07-20 10:26:52 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-07-20 10:26:52 INFO  SparkContext:54 - Running Spark version 2.3.1
2018-07-20 10:26:52 INFO  SparkContext:54 - Submitted application: My Application
2018-07-20 10:26:52 INFO  SecurityManager:54 - Changing view acls to: XXX
2018-07-20 10:26:52 INFO  SecurityManager:54 - Changing modify acls to: XXX
2018-07-20 10:26:52 INFO  SecurityManager:54 - Changing view acls groups to:
2018-07-20 10:26:52 INFO  SecurityManager:54 - Changing modify acls groups to:
2018-07-20 10:26:52 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(XXX); groups with view permissions: Set(); users  with modify permissions: Set(XXX); groups with modify permissions: Set()
2018-07-20 10:26:52 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 36615.
2018-07-20 10:26:52 INFO  SparkEnv:54 - Registering MapOutputTracker
2018-07-20 10:26:52 INFO  SparkEnv:54 - Registering BlockManagerMaster
2018-07-20 10:26:52 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-07-20 10:26:52 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-07-20 10:26:52 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-d8b77ac8-53a3-4923-810e-d81539b30369
2018-07-20 10:26:52 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-07-20 10:26:52 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2018-07-20 10:26:52 INFO  log:192 - Logging initialized @1074ms
2018-07-20 10:26:52 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2018-07-20 10:26:52 INFO  Server:414 - Started @1129ms
2018-07-20 10:26:52 INFO  AbstractConnector:278 - Started ServerConnector@7544a1e4{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-07-20 10:26:52 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a6f5124{/jobs,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@781e7326{/jobs/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22680f52{/jobs/job,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@39c11e6c{/jobs/job/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@324dcd31{/stages,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@503d56b5{/stages/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@72bca894{/stages/stage,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2575f671{/stages/stage/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@329a1243{/stages/pool,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@ecf9fb3{/stages/pool/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2d35442b{/storage,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27f9e982{/storage/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4593ff34{/storage/rdd,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37d3d232{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@30c0ccff{/environment,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@581d969c{/environment/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22db8f4{/executors,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2b46a8c1{/executors/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1d572e62{/executors/threadDump,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@29caf222{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@46cf05f7{/static,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7048f722{/,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@c074c0c{/api,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@58dea0a5{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2a2bb0eb{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-07-20 10:26:52 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://XXX:4040
2018-07-20 10:26:52 INFO  SparkContext:54 - Added JAR file:XXX.jar at spark://XXX with timestamp 1532075212997
2018-07-20 10:26:53 INFO  Executor:54 - Starting executor ID driver on host localhost
2018-07-20 10:26:53 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42981.
2018-07-20 10:26:53 INFO  NettyBlockTransferService:54 - Server created on XXX:42981
2018-07-20 10:26:53 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-07-20 10:26:53 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, XXX, 42981, None)
2018-07-20 10:26:53 INFO  BlockManagerMasterEndpoint:54 - Registering block manager XXX:42981 with 366.3 MB RAM, BlockManagerId(driver, XXX, 42981, None)
2018-07-20 10:26:53 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, XXX, 42981, None)
2018-07-20 10:26:53 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, XXX, 42981, None)
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@66bfd864{/metrics/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  SharedState:54 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:XXX').
2018-07-20 10:26:53 INFO  SharedState:54 - Warehouse path is 'file:XXX'.
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2dbd803f{/SQL,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3e48e859{/SQL/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22d1886d{/SQL/execution,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7df60067{/SQL/execution/json,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748fe51d{/static/sql,null,AVAILABLE,@Spark}
2018-07-20 10:26:53 INFO  StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
2018-07-20 10:26:55 INFO  CodeGenerator:54 - Code generated in 238.939456 ms
2018-07-20 10:26:55 INFO  CodeGenerator:54 - Code generated in 14.895801 ms
2018-07-20 10:26:55 INFO  SparkContext:54 - Starting job: show at XXXTransform.scala:21
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Got job 0 (show at XXXTransform.scala:21) with 1 output partitions
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (show at XXXTransform.scala:21)
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Parents of final stage: List()
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Missing parents: List()
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[4] at show at XXXTransform.scala:21), which has no missing parents
2018-07-20 10:26:55 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 7.3 KB, free 366.3 MB)
2018-07-20 10:26:55 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.5 KB, free 366.3 MB)
2018-07-20 10:26:55 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on XXX:42981 (size: 3.5 KB, free: 366.3 MB)
2018-07-20 10:26:55 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at show at XXXTransform.scala:21) (first 15 tasks are for partitions Vector(0))
2018-07-20 10:26:55 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 1 tasks
2018-07-20 10:26:55 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 8067 bytes)
2018-07-20 10:26:55 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-07-20 10:26:55 INFO  Executor:54 - Fetching spark://XXX.jar with timestamp 1532075212997
2018-07-20 10:26:55 INFO  TransportClientFactory:267 - Successfully created connection to XXX/10.1.97.135:36615 after 23 ms (0 ms spent in bootstraps)
2018-07-20 10:26:55 INFO  Utils:54 - Fetching spark://XXX.jar to /tmp/spark-853e3e5d-e228-473d-be50-c641d2b678a4/userFiles-7f3d8d8b-e9d4-4d9c-ac52-38e7fe59de7d/fetchFileTemp7750137601738248500.tmp
2018-07-20 10:26:55 INFO  Executor:54 - Adding file:/tmp/spark-853e3e5d-e228-473d-be50-c641d2b678a4/userFiles-7f3d8d8b-e9d4-4d9c-ac52-38e7fe59de7d/XXX.jar to class loader
2018-07-20 10:26:55 INFO  CodeGenerator:54 - Code generated in 5.345898 ms
2018-07-20 10:26:55 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 1139 bytes result sent to driver
2018-07-20 10:26:55 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 178 ms on localhost (executor driver) (1/1)
2018-07-20 10:26:55 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-07-20 10:26:55 INFO  DAGScheduler:54 - ResultStage 0 (show at XXXTransform.scala:21) finished in 0,321 s
2018-07-20 10:26:55 INFO  DAGScheduler:54 - Job 0 finished: show at XXXTransform.scala:21, took 0,356248 s
+-------------------+-------+
|column1            |column2|
+-------------------+-------+
|2015-03-29 01:11:23|null   |
+-------------------+-------+
2018-07-20 10:26:55 INFO  SparkContext:54 - Invoking stop() from shutdown hook
2018-07-20 10:26:55 INFO  AbstractConnector:318 - Stopped Spark@7544a1e4{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-07-20 10:26:55 INFO  SparkUI:54 - Stopped Spark web UI at http://XXX:4040
2018-07-20 10:26:55 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-07-20 10:26:55 INFO  MemoryStore:54 - MemoryStore cleared
2018-07-20 10:26:55 INFO  BlockManager:54 - BlockManager stopped
2018-07-20 10:26:55 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2018-07-20 10:26:55 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-07-20 10:26:55 INFO  SparkContext:54 - Successfully stopped SparkContext
2018-07-20 10:26:55 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-07-20 10:26:55 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-1b68790c-9242-480e-a0b4-1b9d7068b27a
2018-07-20 10:26:55 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-853e3e5d-e228-473d-be50-c641d2b678a4

Upvotes: 2

Views: 1245

Answers (1)

Bartosz Konieczny

Reputation: 2033

I was able to reproduce the error. I started my investigation by finding out which dates are problematic, and after executing a small for loop I ended up with:

val spark = SparkSession.builder.appName("My Application").getOrCreate()
for (hour <- 0 to 23) {
  println(s"checking ${hour}")
  // Zero-pad the hour so it matches the HH part of the pattern (e.g. "02").
  val queryHour = f"$hour%02d"
  spark.sql("SELECT CAST(UNIX_TIMESTAMP('2015-03-29T01:11:23Z', \"yyyy-MM-dd'T'HH:mm:ss'Z'\") AS TIMESTAMP) as column1, " +
    "CAST(UNIX_TIMESTAMP('2015-03-29T" + queryHour + ":11:23Z', \"yyyy-MM-dd'T'HH:mm:ss'Z'\") AS TIMESTAMP) as column2").show(false)
}

And it returned:

checking 0
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 00:11:23|
+-------------------+-------------------+

checking 1
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 01:11:23|
+-------------------+-------------------+

checking 2
+-------------------+-------+
|column1            |column2|
+-------------------+-------+
|2015-03-29 01:11:23|null   |
+-------------------+-------+

checking 3
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 03:11:23|
+-------------------+-------------------+

checking 4
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 04:11:23|
+-------------------+-------------------+

checking 5
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 05:11:23|
+-------------------+-------------------+

checking 6
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 06:11:23|
+-------------------+-------------------+

checking 7
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 07:11:23|
+-------------------+-------------------+

checking 8
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 08:11:23|
+-------------------+-------------------+

checking 9
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 09:11:23|
+-------------------+-------------------+

checking 10
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 10:11:23|
+-------------------+-------------------+

checking 11
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 11:11:23|
+-------------------+-------------------+

checking 12
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 12:11:23|
+-------------------+-------------------+

checking 13
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 13:11:23|
+-------------------+-------------------+

checking 14
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 14:11:23|
+-------------------+-------------------+

checking 15
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 15:11:23|
+-------------------+-------------------+

checking 16
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 16:11:23|
+-------------------+-------------------+

checking 17
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 17:11:23|
+-------------------+-------------------+

checking 18
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 18:11:23|
+-------------------+-------------------+

checking 19
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 19:11:23|
+-------------------+-------------------+

checking 20
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 20:11:23|
+-------------------+-------------------+

checking 21
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 21:11:23|
+-------------------+-------------------+

checking 22
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 22:11:23|
+-------------------+-------------------+

checking 23
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 01:11:23|2015-03-29 23:11:23|
+-------------------+-------------------+

Thus, only 2 o'clock was impacted, which already hints at a daylight saving time issue. From there I dug into Spark's datetimeExpressions.scala file, where the datetime conversions are implemented. More precisely, after adding some breakpoints, I found that the problem was caused by a silently swallowed parsing error in the org.apache.spark.sql.catalyst.expressions.UnixTime#eval method, more exactly here:

case StringType if right.foldable =>
  if (constFormat == null || formatter == null) {
    null
  } else {
    try {
      formatter.parse(t.asInstanceOf[UTF8String].toString).getTime / 1000L
    } catch {
      case NonFatal(_) => null
    }
  }
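
Note that any non-fatal exception is swallowed here and converted to null, which is why the query silently returns null instead of surfacing a parsing error.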

And the formatter is created with a time zone:

private lazy val formatter: DateFormat =
  try {
    DateTimeUtils.newDateFormat(constFormat.toString, timeZone)
  } catch {
    case NonFatal(_) => null
  }

Next, I isolated the code that introduces the problem:

val formatterParseError: DateFormat = DateTimeUtils.newDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", DateTimeUtils.getTimeZone("Europe/Paris"))
val formattedDateParseError = formatterParseError.parse("2015-03-29T02:11:23Z")

And indeed, I got a ParseException:

Unparseable date: "2015-03-29T02:11:23Z"
java.text.ParseException: Unparseable date: "2015-03-29T02:11:23Z"
at java.text.DateFormat.parse(DateFormat.java:366)
at com.waitingforcode.stackoverflow.SchemaTest$$anonfun$2.apply$mcV$sp(SchemaTest.scala:225)
at com.waitingforcode.stackoverflow.SchemaTest$$anonfun$2.apply(SchemaTest.scala:218)
at com.waitingforcode.stackoverflow.SchemaTest$$anonfun$2.apply(SchemaTest.scala:218)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
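
Why does only 2 o'clock fail? Because the 'Z' is quoted as a literal in the pattern, the time zone designator in the input is never interpreted, and the string is parsed as a local time in the formatter's time zone. On 2015-03-29 the clocks in Europe/Paris jump from 02:00 directly to 03:00 (the daylight saving time transition), so the local time 02:11:23 simply does not exist and strict parsing rejects it. Here is a minimal sketch, without any Spark dependency, that demonstrates this; I am assuming non-lenient parsing, which (as of Spark 2.3) DateTimeUtils.newDateFormat also enables via setLenient(false):

import java.text.{ParseException, SimpleDateFormat}
import java.util.{Locale, TimeZone}

// Same pattern as in the query: the trailing 'Z' is a literal character,
// so the input is parsed as a local time in the formatter's time zone.
val literalZ = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US)
literalZ.setTimeZone(TimeZone.getTimeZone("Europe/Paris"))
literalZ.setLenient(false) // assumption: mirrors Spark's DateTimeUtils.newDateFormat

// 01:11:23 exists in Europe/Paris on that day, so it parses fine.
println(literalZ.parse("2015-03-29T01:11:23Z"))

// 02:11:23 falls into the DST gap (02:00 -> 03:00), so strict parsing fails.
try {
  literalZ.parse("2015-03-29T02:11:23Z")
} catch {
  case e: ParseException => println(s"Failed as expected: ${e.getMessage}")
}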

From there I checked the time zone handling in the SimpleDateFormat Javadoc: https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html#timezone . It turns out that when the time zone is expressed as 'Z' (ISO 8601 time zone), the correct pattern letter to use is X. When I changed it in my isolated test case, the date was parsed correctly:

val formatter: DateFormat = DateTimeUtils.newDateFormat("yyyy-MM-dd'T'HH:mm:ssX", DateTimeUtils.getTimeZone("Europe/Paris"))
val formattedDate = formatter.parse("2015-03-29T02:11:23Z")
println(s"formattedDate = ${formattedDate}")
// formattedDate = Sun Mar 29 04:11:23 CEST 2015

And it also worked in Spark SQL, which gave the following (values are converted to the Europe/Paris time zone; you can change it via the spark.sql.session.timeZone option):

checking 0
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 01:11:23|
+-------------------+-------------------+

checking 1
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 03:11:23|
+-------------------+-------------------+

checking 2
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 04:11:23|
+-------------------+-------------------+

checking 3
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 05:11:23|
+-------------------+-------------------+

checking 4
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 06:11:23|
+-------------------+-------------------+

checking 5
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 07:11:23|
+-------------------+-------------------+

checking 6
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 08:11:23|
+-------------------+-------------------+

checking 7
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 09:11:23|
+-------------------+-------------------+

checking 8
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 10:11:23|
+-------------------+-------------------+

checking 9
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 11:11:23|
+-------------------+-------------------+

checking 10
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 12:11:23|
+-------------------+-------------------+

checking 11
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 13:11:23|
+-------------------+-------------------+

checking 12
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 14:11:23|
+-------------------+-------------------+

checking 13
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 15:11:23|
+-------------------+-------------------+

checking 14
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 16:11:23|
+-------------------+-------------------+

checking 15
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 17:11:23|
+-------------------+-------------------+

checking 16
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 18:11:23|
+-------------------+-------------------+

checking 17
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 19:11:23|
+-------------------+-------------------+

checking 18
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 20:11:23|
+-------------------+-------------------+

checking 19
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 21:11:23|
+-------------------+-------------------+

checking 20
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 22:11:23|
+-------------------+-------------------+

checking 21
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-29 23:11:23|
+-------------------+-------------------+

checking 22
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-30 00:11:23|
+-------------------+-------------------+

checking 23
+-------------------+-------------------+
|column1            |column2            |
+-------------------+-------------------+
|2015-03-29 03:11:23|2015-03-30 01:11:23|
+-------------------+-------------------+

To recap: when parsing fails, UnixTime#eval silently returns null, as shown in the snippet above, and we can solve the problem by using the ISO 8601 time zone pattern letter (X) instead of a literal 'Z'.
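
For completeness, here is a sketch of the corrected query from the question, with the literal 'Z' in the pattern replaced by X and everything else unchanged:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("My Application").getOrCreate()
// With X, the parser interprets the trailing Z as an ISO 8601 zone offset (UTC),
// so the input is no longer read as a possibly nonexistent local time.
spark.sql(
  "SELECT CAST(UNIX_TIMESTAMP('2015-03-29T01:11:23Z', \"yyyy-MM-dd'T'HH:mm:ssX\") AS TIMESTAMP) as column1, " +
  "CAST(UNIX_TIMESTAMP('2015-03-29T02:11:23Z', \"yyyy-MM-dd'T'HH:mm:ssX\") AS TIMESTAMP) as column2"
).show(false)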

Upvotes: 4
