Reputation: 37
How can I convert a timestamp stored as a string into a timestamp in "yyyy-mm-ddThh:mm:ss.sssZ" format using PySpark?
Input timestamp (string), df:
| col_string |
| :-------------------- |
| 5/15/2022 2:11:06 AM |
Desired output (timestamp), df:
| col_timestamp |
| :---------------------- |
| 2022-05-15T2:11:06.000Z |
Upvotes: 0
Views: 91
Reputation: 24498
`to_timestamp` can be used, providing the optional `format` parameter.
from pyspark.sql import functions as F
df = spark.createDataFrame([("5/15/2022 2:11:06 AM",)], ["col_string"])
# "d" rather than "dd" so that single-digit days (e.g. 5/5/2022) also parse
df = df.select(F.to_timestamp("col_string", "M/d/yyyy h:mm:ss a").alias("col_ts"))
df.show()
# +-------------------+
# | col_ts|
# +-------------------+
# |2022-05-15 02:11:06|
# +-------------------+
df.printSchema()
# root
# |-- col_ts: timestamp (nullable = true)
Upvotes: 2