Reputation: 1530
I have a column Time in my Spark DataFrame. It is of string type, and I need to convert it to timestamp. I have tried the following:
from pyspark.sql.functions import unix_timestamp
from pyspark.sql.types import TimestampType

data.select(unix_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()).alias("timestamp"))
data.printSchema()
The output is:
root
|-- Time: string (nullable = true)
If I save the result in a new df, then I lose all of my other columns.
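A minimal sketch of that failure mode (the variable name only_ts is made up for illustration): select returns a DataFrame containing only the columns listed, so everything else is dropped.

# select() keeps only the listed columns, so just "timestamp" survives here
only_ts = data.select(unix_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()).alias("timestamp"))
only_ts.printSchema()
# root
#  |-- timestamp: timestamp (nullable = true)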
Upvotes: 1
Views: 2720
Reputation: 2091
You can use withColumn instead of select. withColumn returns a new DataFrame that keeps all existing columns and appends the new one, so nothing is lost:
from pyspark.sql.functions import unix_timestamp
from pyspark.sql.types import TimestampType

data = spark.createDataFrame([('1997/02/28 10:30:00', "test")], ['Time', 'Col_Test'])
df = data.withColumn("timestamp", unix_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()))
Output:
>>> df.show()
+-------------------+--------+-------------------+
| Time|Col_Test| timestamp|
+-------------------+--------+-------------------+
|1997/02/28 10:30:00| test|1997-02-28 10:30:00|
+-------------------+--------+-------------------+
>>> data.printSchema()
root
|-- Time: string (nullable = true)
|-- Col_Test: string (nullable = true)
>>> df.printSchema()
root
|-- Time: string (nullable = true)
|-- Col_Test: string (nullable = true)
|-- timestamp: timestamp (nullable = true)
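As a side note, if you are on Spark 2.2 or later (a version assumption), to_timestamp does the parse and the cast in one step:

from pyspark.sql.functions import to_timestamp

# to_timestamp parses the string with the given pattern and returns a
# timestamp column directly, so no unix_timestamp/cast round-trip is needed
df = data.withColumn("timestamp", to_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss'))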
Upvotes: 1