Reputation: 3576
I need to convert a date from this format: 2019-10-22 00:00:00
to this one: 2019-10-22T00:00:00.000Z
I know this can be done in some databases via:
In AWS Redshift, you can achieve this using the following:
TO_DATE('{RUN_DATE_YYYY/MM/DD}', 'YYYY/MM/DD') || 'T00:00:00.000Z' AS VERSION_TIME
But my platform is Spark SQL, so neither of the above works for me. The best I could come up with is this:
concat(d2.VERSION_TIME, 'T00:00:00.000Z') as VERSION_TIME
This is a bit hacky and still not correct. With it, I get this result: 2019-10-25 00:00:00T00:00:00.000Z, but the 00:00:00 in the middle of the string is redundant and I cannot leave it there.
Any insight here would be greatly appreciated!
Upvotes: 9
Views: 38802
Reputation: 13541
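I think this is the natural way: parse the string with to_timestamp, then format it with date_format, quoting the T and Z so they are treated as literals.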
spark.sql("""SELECT date_format(to_timestamp("2019-10-22 00:00:00", "yyyy-MM-dd HH:mm:ss"), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") as date""").show(false)
The result is:
+------------------------+
|date                    |
+------------------------+
|2019-10-22T00:00:00.000Z|
+------------------------+
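To apply the same pattern to a column rather than a literal, something like the following might work. This is only a sketch: the temporary view d2 and its single string column VERSION_TIME are hypothetical stand-ins for the table in the question, and the local SparkSession is there just to make the snippet self-contained (in spark-shell, spark already exists). It assumes Spark 2.2 or later, where to_timestamp accepts a format argument.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell, `spark` is already defined.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("iso-format-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the question's table: one string column, VERSION_TIME.
Seq("2019-10-22 00:00:00").toDF("VERSION_TIME").createOrReplaceTempView("d2")

// Parse explicitly, then format. 'T' and 'Z' are quoted in the pattern,
// so they are emitted as literal characters.
val formatted = spark.sql(
  """SELECT date_format(
    |         to_timestamp(VERSION_TIME, 'yyyy-MM-dd HH:mm:ss'),
    |         "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
    |       ) AS VERSION_TIME
    |FROM d2""".stripMargin)

// Pull the single value back to the driver for inspection.
val result = formatted.as[String].head()
```

Note that the quoted 'Z' is just a literal suffix. It does not convert the value to UTC, so the output is only a true UTC timestamp if the input values are already in UTC.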
Upvotes: 10
Reputation: 2451
Maybe something like this? It's a slightly different approach.
scala> val df = spark.range(1).select(current_date.as("date"))
scala> df.show()
+----------+
|      date|
+----------+
|2019-11-09|
+----------+
scala> df.withColumn("formatted",
         concat(
           regexp_replace(date_format('date, "yyyy-MM-dd\tHH:mm:ss.SSS"), "\t", "T"),
           lit("Z")
         )
       ).show(false)
+----------+------------------------+
|date      |formatted               |
+----------+------------------------+
|2019-11-09|2019-11-09T00:00:00.000Z|
+----------+------------------------+
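The tab placeholder and regexp_replace can probably be avoided by quoting the literals directly in the pattern. A minimal sketch with the DataFrame API, assuming a local SparkSession and a fixed date in place of current_date so the result is reproducible (the names out and formattedValue are just illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_format, lit, to_date}

// Local session for illustration only; in spark-shell, `spark` is already defined.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("iso-format-df-sketch")
  .getOrCreate()

// Fixed date instead of current_date, so the output does not depend on the run day.
val df = spark.range(1).select(to_date(lit("2019-11-09")).as("date"))

// Single quotes inside the pattern mark T and Z as literal text,
// so no placeholder character or regexp_replace is needed.
val out = df.withColumn(
  "formatted",
  date_format(col("date"), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
)

// Pull the single formatted value back to the driver for inspection.
val formattedValue = out.select("formatted").head().getString(0)
```

Because the column is a date, the time part is filled in as midnight, which matches the output shown above.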
Upvotes: 3