Reputation: 779
Here is my source code in a Databricks notebook using Python:
from pyspark.sql.types import StructType, StructField, DateType

data = [('2021-01-01','2021-01-02')]
schema1 = StructType([
    StructField("date1", DateType(), True),
    StructField("date2", DateType(), True)])
spark.createDataFrame(data,schema1).show()
However, this raises an error.
Does anyone have an idea why?
Upvotes: 3
Views: 5187
Reputation: 909
Your data contains strings, but the schema declares both columns as DateType, so createDataFrame rejects it.
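To make the mismatch concrete, here is a small plain-Python illustration that does not need Spark. It is only a simplified picture of the idea, not Spark's actual verification code: a string that looks like a date is still a `str`, whereas a DateType column expects `datetime.date` objects.

```python
import datetime

value = '2021-01-01'

# A str is a different type from datetime.date,
# even when its text looks like a date.
print(isinstance(value, datetime.date))   # False

# Parsing the ISO-formatted string produces a real date object.
parsed = datetime.date.fromisoformat(value)
print(isinstance(parsed, datetime.date))  # True
```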
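I see two solutions. The first is to convert the strings to Python date objects before creating the DataFrame:
import datetime
from pyspark.sql.types import StructType, StructField, DateType

# Parse each string into a datetime.date so it matches DateType
data = [(
    datetime.datetime.strptime('2021-01-01', "%Y-%m-%d").date(),
    datetime.datetime.strptime('2021-01-02', "%Y-%m-%d").date()
)]
schema1 = StructType([
    StructField("date1", DateType(), True),
    StructField("date2", DateType(), True)])
df = spark.createDataFrame(data, schema1)
df.show()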
# output:
+----------+----------+
|     date1|     date2|
+----------+----------+
|2021-01-01|2021-01-02|
+----------+----------+
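If there are many rows, parsing each value by hand gets tedious. A minimal sketch of a helper that converts every string in every row is shown here; the name `to_date_rows` and the second sample row are hypothetical, added only for illustration, and the helper assumes every value in every row is a date string in the given format.

```python
import datetime

def to_date_rows(rows, fmt="%Y-%m-%d"):
    """Convert every string in every row tuple to a datetime.date."""
    return [
        tuple(datetime.datetime.strptime(v, fmt).date() for v in row)
        for row in rows
    ]

# The second row is made-up sample data.
raw = [('2021-01-01', '2021-01-02'), ('2021-02-01', '2021-02-15')]
data = to_date_rows(raw)
print(data[0])
```

The resulting `data` list can then be passed to `spark.createDataFrame(data, schema1)` in the same way as in the snippet above it.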
The second is to create the DataFrame with string columns and then convert them with to_date:
from pyspark.sql import functions as F
data = [('2021-01-01','2021-01-02')]
df = spark.createDataFrame(data)
df = df.select(*(F.to_date(c) for c in df.columns))
df.show()
# output:
+-----------+-----------+
|to_date(_1)|to_date(_2)|
+-----------+-----------+
| 2021-01-01| 2021-01-02|
+-----------+-----------+
Upvotes: 3