Reputation: 493
I want to use spark SQL or pyspark to reformat a date field from 'dd/mm/yyyy' to 'yyyy/mm/dd'. The field type is string:
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType
spark = SparkSession.builder.master("local[1]")\
.appName("date.com")\
.getOrCreate()
my_df = spark.createDataFrame(["13/04/2020", "16/04/2020", "19/04/2020"], StringType()).toDF("date")
expected_df = spark.createDataFrame(["2020/04/12", "2020/04/16", "2020/04/19"], StringType()).toDF("date")
I have tried the following spark sql command, but this returns the date as literally 'yyyy/MM/dd' rather than '2020/04/12'.
select date_format(date, 'dd/MM/yyyy'), 'yyyy/MM/dd' as reformatted_date
FROM my_df
I have also looked at the following documentation but didn't see anything that fits my scenario: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
If it's not possible in spark sql then pyspark would work. Any ideas?
Upvotes: 1
Views: 5584
Reputation: 1
df1.select( to_date(date_format(to_date(lit("12/12/2020"), "dd/MM/yyyy"), "yyyy-MM-dd") ).as("campo")).show()
Upvotes: 0
Reputation: 42392
You need to convert to date type using to_date
first:
select date_format(to_date(date, 'dd/MM/yyyy'), 'yyyy/MM/dd') as reformatted_date
from my_df
Upvotes: 3