Reputation: 51
I am trying to create a DataFrame from a sample record. One of the fields is of DateType, and I am getting an error for the value provided in that DateType field. The code is below. The error is:
TypeError: field date: DateType can not accept object '2019-12-01' in type <class 'str'>
I tried to convert the StringType value to DateType using to_date and a few other approaches, but could not get it to work. Please advise.
from pyspark.sql.functions import to_date,col,lit,expr
from pyspark.sql.types import StructType,StructField,IntegerType,DateType,StringType
from pyspark.sql import Row
MySchema = StructType([ StructField("CustomerID",IntegerType(),True),
StructField("Quantity",IntegerType(),True),
StructField("date",DateType(),True)
])
myRow=Row(10,100,"2019-12-01")
mydf=spark.createDataFrame([myRow],MySchema)
display(mydf)
Upvotes: 5
Views: 9973
Reputation: 473
What works for me (I'm on Python 3.8.12 and Spark version 3.0.1):
from datetime import datetime
from pyspark.sql.types import DateType, StructType, StructField, IntegerType
from pyspark.sql import SparkSession, Row
MySchema = StructType([ StructField("CustomerID",IntegerType(),True),
StructField("Quantity",IntegerType(),True),
StructField("date",DateType(),True)
])
spark = SparkSession.builder.appName("local").master("local").getOrCreate()
myRow=Row(10,100,datetime(2019, 12, 1))
mydf=spark.createDataFrame([myRow],MySchema)
mydf.show(truncate=False)  # I'm not on Databricks, so I use mydf.show(truncate=False) instead of display
Upvotes: 1
Reputation: 943
You can use the datetime class to convert the string to a date:
from datetime import datetime
myRow=Row(10,100,datetime.strptime('2019-12-01','%Y-%m-%d'))
mydf=spark.createDataFrame([myRow],MySchema)
mydf.show()
It should work.
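The reason this works is that DateType accepts Python datetime.date (or datetime.datetime) objects, not strings, so the value has to be converted before the row reaches createDataFrame. Below is a minimal sketch of that conversion using only the standard library. The helper name parse_iso_date and the variables raw_record and clean_record are hypothetical names chosen for illustration, and the sketch assumes the input is always in YYYY-MM-DD form.

```python
from datetime import date, datetime


def parse_iso_date(value: str) -> date:
    """Convert a 'YYYY-MM-DD' string into a datetime.date.

    Hypothetical helper for illustration; raises ValueError if the
    string does not match the expected format.
    """
    return datetime.strptime(value, "%Y-%m-%d").date()


# Sample record from the question, with the date still stored as a string.
raw_record = (10, 100, "2019-12-01")

# Replace the string with a real date object before handing it to Spark.
clean_record = (raw_record[0], raw_record[1], parse_iso_date(raw_record[2]))

print(clean_record)
```

The converted tuple can then be passed in place of the original row, for example spark.createDataFrame([clean_record], MySchema), using the schema from the question.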
Upvotes: 5