P3P0

Reputation: 165

Convert int YYYYMMDD to date in PySpark

I'm trying to convert an integer column to a date column in Databricks with PySpark. The column looks like this:

Report_Date
20210102
20210102
20210106
20210103
20210104

I tried casting the column directly with cast:

df = df.withColumn("Report_Date", col("Report_Date").cast(DateType()))

but I get the following error:

Cannot resolve 'CAST(`Report_Date` AS DATE)' due to data type mismatch: cannot cast int to date;

How can I convert this column to a date?

Upvotes: 0

Views: 16636

Answers (1)

mck

Reputation: 42332

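Spark can't cast an int directly to a date, as the error says. Cast the column to string type first, then parse it with to_date using the pattern "yyyyMMdd":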

import pyspark.sql.functions as F

# Overwrite Report_Date: int -> string -> date parsed with the yyyyMMdd pattern
df2 = df.withColumn(
    "Report_Date",
    F.to_date(F.col("Report_Date").cast("string"), "yyyyMMdd")
)
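
To check the logic without a cluster, here is a minimal plain-Python sketch of the same two steps (cast to string, then parse with a pattern). It is only an illustration, not Spark code: the helper name int_to_date is hypothetical, "%Y%m%d" is the strptime counterpart of "yyyyMMdd", and returning None for missing or invalid values is an assumption of this sketch rather than a statement about how to_date behaves.

```python
from datetime import date, datetime
from typing import Optional


def int_to_date(value: Optional[int]) -> Optional[date]:
    """Hypothetical helper: parse an int like 20210102 into a date.

    Returns None for missing or invalid input (a choice made for this sketch).
    """
    if value is None:
        return None
    text = str(value)  # step 1: cast the integer to a string
    if len(text) != 8:
        # Reject anything that is not exactly eight characters long.
        return None
    try:
        # Step 2: parse the string with the year-month-day pattern.
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        # Not a real calendar date, for example 20210230.
        return None


# Sample values taken from the question.
report_dates = [20210102, 20210102, 20210106, 20210103, 20210104]
converted = [int_to_date(v) for v in report_dates]
print(converted[0])  # 2021-01-02
```

In a real job the Spark expression above is still the better choice, since it runs inside the engine instead of row by row in Python.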

Upvotes: 10
