Anos

Reputation: 57

How to cast Date column from string to datetime in pyspark/python?

I have a date column that PySpark infers as a string. A sample value:

Mon Oct 17 15:57:48 EST 2022

How can I cast this string column to a datetime (timestamp) type?

Upvotes: 0

Views: 662

Answers (1)

samkart

Reputation: 6654

You can pass the matching datetime pattern, 'E MMM dd HH:mm:ss z yyyy', to to_timestamp. The resulting timestamp is shown in UTC here, so it appears 5 hours ahead of the source value, which is in EST (UTC-5).

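from pyspark.sql import functions as func

# Spark 3's default parser does not accept the 'E' (day-of-week) pattern
# when parsing, so switch to the legacy parser. `spark` is an existing SparkSession.
spark.conf.set('spark.sql.legacy.timeParserPolicy', 'LEGACY')

spark.sparkContext.parallelize([('Mon Oct 17 15:57:48 EST 2022', )]).toDF(['dt_str']). \
    withColumn('dt', func.to_timestamp('dt_str', 'E MMM dd HH:mm:ss z yyyy')). \
    show(truncate=False)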

# +----------------------------+-------------------+
# |dt_str                      |dt                 |
# +----------------------------+-------------------+
# |Mon Oct 17 15:57:48 EST 2022|2022-10-17 20:57:48|
# +----------------------------+-------------------+
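If you want to sanity-check the 5-hour shift outside Spark, here is a minimal plain-Python sketch. It is not part of the Spark approach above: `TZ_OFFSETS` and `parse_dt_str` are hypothetical names, and the fixed EST = UTC-5 mapping is an assumption you would need to extend for other abbreviations. The abbreviation is handled by hand because `strptime`'s `%Z` does not reliably recognize names like EST.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from zone abbreviation to a fixed UTC offset (illustrative only).
TZ_OFFSETS = {
    "EST": timezone(timedelta(hours=-5), "EST"),
    "UTC": timezone.utc,
}

def parse_dt_str(dt_str):
    """Parse a string like 'Mon Oct 17 15:57:48 EST 2022' into an aware UTC datetime."""
    # Split into: weekday, month, day of month, clock time, zone abbreviation, year.
    _weekday, month, day, clock, tz_abbr, year = dt_str.split()
    # Parse everything except the weekday and the zone as a naive datetime.
    naive = datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")
    # Attach the source offset, then convert to UTC.
    return naive.replace(tzinfo=TZ_OFFSETS[tz_abbr]).astimezone(timezone.utc)

print(parse_dt_str("Mon Oct 17 15:57:48 EST 2022"))
# 2022-10-17 20:57:48+00:00
```

The printed value matches the dt column in the Spark output above, which suggests the Spark result is the same instant expressed in UTC.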

Upvotes: 1
