Reputation: 23749
In the following, Hiring_date is of DateType. df2 fills the null dates as '1900-01-01', but the date format of the actual data is mm/dd/yyyy, so I want the null values to be filled as 01/01/1900. I tried the code shown in the second block below, but the Hiring_date column still showed the null values as NULL.
Question: What might I be missing, and how can I fix it? The more important part of the question is probably: why does Code 2 ignore the 01/01/1900 fill value?
Code 1: Fills the null date values as '1900-01-01', but I need the 01/01/1900 format.
from pyspark.sql.types import DateType
df1 = df.withColumn("Hiring_date", df.Hiring_date.cast(DateType()))
df2 = df1.fillna({'Hiring_date': '1900-01-01'})
Code 2: Leaves the null date values as NULL, but I need them to display as 01/01/1900.
df1 = df.withColumn("Hiring_date", df.Hiring_date.cast(DateType()))
df2 = df1.fillna({'Hiring_date': '01/01/1900'})
Upvotes: 0
Views: 1243
Reputation: 16147
I believe the only valid string format for a DateType input is yyyy-MM-dd, which explains why your first snippet works and the second does not.
You seem to want a string representation of the date, which you can get with:
from pyspark.sql.functions import col, date_format
df.withColumn('Hiring_date', date_format(col('Hiring_date'), 'MM/dd/yyyy'))
Upvotes: 1