Marbo
Marbo

Reputation: 96

PySpark: String to timestamp transformation

I am working with time data and try to convert the string to timestamp format.

Here is what the 'Time' column looks like

+----------+
|  Time    |
+----------+
|1358380800|
|1380672000|
+----------+

Here is what I want

+---------------+
|  Time         |
+---------------+
|2013/1/17 8:0:0|
|2013/10/2 8:0:0|
+---------------+

I find some similar questions and answers and have tried these code, but all end with 'null'

df2 = df.withColumn("Time", test["Time"].cast(TimestampType()))

df2 = df.withColumn('Time', F.unix_timestamp('Time', 'yyyy-MM-dd').cast(TimestampType()))

Upvotes: 0

Views: 2757

Answers (1)

cronoik
cronoik

Reputation: 19550

Well your are doing it the other way around. The sql function unix_timestamp converts a string with the given format to a unix timestamp. When you want to convert a unix timestamp to the datetime format, you have to use the from_unixtime sql function:

from pyspark.sql import functions as F
from pyspark.sql import types as T
l1 = [('1358380800',),('1380672000',)]
df = spark.createDataFrame(l1,['Time'])
df.withColumn('Time', F.from_unixtime(df.Time).cast(T.TimestampType())).show()

Output:

+-------------------+ 
|               Time|
+-------------------+
|2013-01-17 01:00:00|
|2013-10-02 02:00:00|
+-------------------+

Upvotes: 1

Related Questions