Reputation: 335
I have a column containing values like "Tuesday, 09-Aug-11 21:13:26 GMT". I want to define a schema in Spark, but neither TimestampType nor DateType recognizes this date format. After loading the data into a DataFrame with TimestampType or DateType, I see NULL values in that column.
Is there an alternative?
Upvotes: 0
Views: 253
Reputation: 10372
One option is to read "Tuesday, 09-Aug-11 21:13:26 GMT" as a string column and then convert the string to a timestamp with to_timestamp, as shown below.
df.show(truncate=false)
+-------------------------------+
|dt                             |
+-------------------------------+
|Tuesday, 09-Aug-11 21:13:26 GMT|
+-------------------------------+
// Note: the GMT value is displayed in the session's local time zone (IST here), hence the shifted time in the output.
df.withColumn("dt", to_timestamp(col("dt"), "E, d-MMM-y H:m:s z")).show(truncate=false)
+-------------------+
|dt                 |
+-------------------+
|2011-08-10 02:43:26|
+-------------------+
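A caveat worth checking on newer versions: as far as I know, the Spark 3.x datetime pattern documentation lists 'E' as a formatting-only symbol, so the pattern above may fail or return NULL there unless spark.sql.legacy.timeParserPolicy is set to LEGACY. The sketch below is one possible workaround, not a definitive fix: it strips the day name with regexp_replace and parses the remainder. The object name, app name, regex, and the choice to set the session time zone to UTC are my own assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace, to_timestamp}

object ParseDayNameTimestamp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("parse-day-name-timestamp") // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Assumption: show timestamps in UTC rather than the machine's local zone.
    spark.conf.set("spark.sql.session.timeZone", "UTC")

    val df = Seq("Tuesday, 09-Aug-11 21:13:26 GMT").toDF("dt")

    // Drop the leading day name and comma, leaving "09-Aug-11 21:13:26 GMT".
    val withoutDay = regexp_replace(col("dt"), "^[A-Za-z]+,\\s*", "")

    // Parse the remainder with a pattern that avoids the 'E' symbol.
    val parsed = df.withColumn("dt", to_timestamp(withoutDay, "dd-MMM-yy HH:mm:ss z"))
    parsed.show(truncate = false)

    spark.stop()
  }
}
```

With the session time zone set explicitly, the displayed value no longer depends on where the job runs, which avoids the GMT to IST shift seen in the output above.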
Upvotes: 3