Reputation: 1604
I keep getting NULLs with every solution I have found for this question.
spark.sql('select DateColumn from table')
yields the following value
2021-07-26T22:12:10.3990989Z
I would like to convert this string into a datetime value without the fractional seconds, like this:
2021-07-26 22:12:10 PM
How can I perform this conversion, either in the SQL statement or outside of it?
Upvotes: 1
Views: 663
Reputation: 391
Maybe this would help -
ts = '2021-07-26T22:12:10.3990989Z'
query = f"""
SELECT MAKE_TIMESTAMP(
YEAR('{ts}'),
MONTH('{ts}'),
DAY('{ts}'),
HOUR('{ts}'),
MINUTE('{ts}'),
SECOND('{ts}')
) as ts """
spark.sql(query).show()
Results -
+-------------------+
|                 ts|
+-------------------+
|2021-07-26 22:12:10|
+-------------------+
MAKE_TIMESTAMP also accepts an optional seventh argument, the time zone; the hour shown depends on your Spark session time zone, so running this on your machine should yield the desired result.
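An alternative sketch, assuming Spark 2.3 or later (where date_trunc is available) and that your session time zone matches what you want displayed: cast the string to a timestamp and truncate the fractional seconds directly.

```sql
SELECT date_trunc('second',
                  CAST('2021-07-26T22:12:10.3990989Z' AS timestamp)) AS ts
```

This avoids rebuilding the timestamp field by field.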
Upvotes: 0
Reputation: 31540
Try with the from_unixtime and unix_timestamp functions.
Example:
df.show()
#+----------------------------+
#|ts |
#+----------------------------+
#|2021-07-26T22:12:10.3990989Z|
#+----------------------------+
from pyspark.sql.functions import col, from_unixtime, unix_timestamp

df.withColumn("ts1",from_unixtime(unix_timestamp(col("ts"),"yyyy-MM-dd'T'HH:mm:ss.SSSSSSS'Z'"),"yyyy-MM-dd HH:mm:ss a")).\
show(10,False)
#+----------------------------+----------------------+
#|ts |ts1 |
#+----------------------------+----------------------+
#|2021-07-26T22:12:10.3990989Z|2021-07-26 22:12:10 PM|
#+----------------------------+----------------------+
UPDATE: to get a real timestamp column instead of a string, cast the result:
df.withColumn("ts1",from_unixtime(unix_timestamp(col("ts"),"yyyy-MM-dd'T'HH:mm:ss.SSSSSSS'Z'"),"yyyy-MM-dd HH:mm:ss").cast("timestamp")).\
printSchema()
#root
# |-- ts: string (nullable = true)
# |-- ts1: timestamp (nullable = true)
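Outside of Spark, the same conversion can be sketched in plain Python (an illustration only, not part of the Spark pipeline; note Python's %f directive accepts at most six fractional digits, so the seven-digit fraction is stripped before parsing):

```python
from datetime import datetime

ts = "2021-07-26T22:12:10.3990989Z"

# Drop the fractional seconds and the trailing 'Z' before parsing,
# since %f only handles up to six digits.
dt = datetime.strptime(ts.split(".")[0], "%Y-%m-%dT%H:%M:%S")

# %H keeps the 24-hour clock and %p appends the AM/PM marker,
# matching the "22:12:10 PM" layout asked for in the question.
print(dt.strftime("%Y-%m-%d %H:%M:%S %p"))
```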
Upvotes: 1