martian_rover
martian_rover

Reputation: 341

Pyspark: Convert Column from String Numbers to Timestamp Type

I am looking to convert Numbers in string format to timestamp.

Sample:

+--------+-------------------+
|date_str|expected date      |
+--------+-------------------+
|1.63E+15|1991-11-25 13:39:00|
|1.63E+15|1991-11-25 13:40:00|
|1.63E+15|1991-11-25 13:41:00|
+--------+-------------------+

I have tried using to_timestamp() but returning Null values. also converted to unix_timestamp but no luck.

sdf1.select(F.to_timestamp(sdf1.date_str, 'yyyy/MM/dd HH:mm:ss').alias('date')).show()
sdf1.select(to_timestamp('date_str', 'dd/MM/yyyy HH:mm:ss')).show()

Upvotes: 0

Views: 890

Answers (1)

Steven
Steven

Reputation: 15318

First of all, the transformation that you ask is not possible. The same input cannot produce different outputs. It should be :

+--------+-------------------+
|date_str|expected date      |
+--------+-------------------+
|1.63E+15|1991-11-25 13:39:00|  # Same "expected date" at each line
|1.63E+15|1991-11-25 13:39:00|
|1.63E+15|1991-11-25 13:39:00|
+--------+-------------------+

Then, the function you need is probably from_unixtime - converting a timestamp in numeric format to timestamp in string/timestamp format :

time_df = spark.createDataFrame([(1428476400,)], ['unix_time'])

time_df.select(from_unixtime('unix_time').alias('ts')).collect()
# [Row(ts='2015-04-08 00:00:00')]

The only probleme is that, in your example, your numbers are 16-digits long (E+15) which is too much precision for a unix timestamp. You should probably divide it by 1000000.

from pyspark.sql import functions as F, types as T

df.withColumn(
    "date_num", F.col("date_str").cast(T.DecimalType(16, 0)) / 1000000
).withColumn(
    "date", F.from_unixtime("date_num")
).show()
+--------+-------------------+-------------------+
|date_str|           date_num|               date|
+--------+-------------------+-------------------+
|1.63E+15|1630000000.00000000|2021-08-26 17:46:40|
+--------+-------------------+-------------------+

Upvotes: 2

Related Questions