Reputation: 165
I'm trying to convert an INT column to a date column in Databricks with PySpark. The column looks like this:
Report_Date
20210102
20210102
20210106
20210103
20210104
I tried the cast function:
df = df.withColumn("Report_Date", col("Report_Date").cast(DateType()))
but I get the following error:
Cannot resolve 'CAST(`Report_Date` AS DATE)' due to data type mismatch: cannot cast int to date;
How can I get the expected output?
Upvotes: 0
Views: 16636
Reputation: 42332
Cast to string type first, then use to_date:
import pyspark.sql.functions as F

# Cast the int to a string, then parse it with an explicit yyyyMMdd pattern
df2 = df.withColumn(
    "Report_Date",
    F.to_date(F.col("Report_Date").cast("string"), "yyyyMMdd")
)
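For intuition, here is a minimal sketch of the same two steps (int to string, then string to date with an explicit pattern) in plain Python, using only the standard library rather than Spark. The helper name `int_to_date` is hypothetical, and `%Y%m%d` is Python's equivalent of the `yyyyMMdd` pattern used above.

```python
from datetime import date, datetime


def int_to_date(value: int) -> date:
    """Parse an integer such as 20210102 as a yyyyMMdd calendar date.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    # Step 1: int -> string (the analogue of .cast("string")).
    text = str(value)
    # Step 2: string -> date using an explicit pattern.
    return datetime.strptime(text, "%Y%m%d").date()


# Sample values taken from the Report_Date column in the question.
sample = [20210102, 20210102, 20210106, 20210103, 20210104]
converted = [int_to_date(v) for v in sample]
```

This only mirrors the logic on a Python list. For a DataFrame, the built-in to_date call above is likely the better choice, since it avoids running a Python function per row.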
Upvotes: 10