Reputation: 302
I have a PySpark dataframe with a 'Week_of_the_year' column, where '202005' means the 5th week of 2020. How can I convert it to a 'date' type, ideally the mid-week date (Wednesday) of that week?
Example: I want '202005' to show as '2020-01-29'.
Upvotes: 3
Views: 3830
Reputation: 8410
You can use the to_date function on your column with 3 (day of week: Wednesday) concatenated, like 2020053, where 2020 is the year, 05 is the week of the year, and 3 is the weekday number. Refer to Java's SimpleDateFormat documentation for the date and time pattern characters.
from pyspark.sql import functions as F
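# Append "3" (Wednesday) to the year-week string, then parse it as year (yyyy) + week of year (ww) + day of week (u).
df.withColumn("new_date", F.to_date(F.concat("old_date", F.lit("3")), "yyyywwu")).show()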
#+--------+----------+
#|old_date|  new_date|
#+--------+----------+
#|  202005|2020-01-29|
#+--------+----------+
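As a cross-check of the expected dates outside Spark, here is a minimal plain-Python sketch. It assumes ISO 8601 week numbering (weeks start on Monday, week 1 contains the first Thursday of the year), which can differ from the locale-dependent week rules SimpleDateFormat applies, although the two agree for the example in the question. The helper name week_to_midweek is hypothetical and not part of any library. Also note that on Spark 3.0 and later the week-based pattern letters are, as far as I know, rejected by the default parser unless spark.sql.legacy.timeParserPolicy is set to LEGACY, so a check like this can help confirm results after an upgrade.

```python
from datetime import date


def week_to_midweek(yyyyww: str, weekday: int = 3) -> date:
    """Convert a 'yyyyww' string to a date in that ISO week.

    Hypothetical helper for illustration. Assumes ISO 8601 weeks;
    weekday follows ISO numbering (1 = Monday, ..., 7 = Sunday),
    so the default of 3 yields Wednesday.
    """
    year = int(yyyyww[:4])   # first four characters: the year
    week = int(yyyyww[4:])   # remaining characters: the week number
    # date.fromisocalendar requires Python 3.8 or newer.
    return date.fromisocalendar(year, week, weekday)


print(week_to_midweek("202005"))  # 2020-01-29
```

If the ISO convention matches your data, the same function could be wrapped in a UDF, though the built-in to_date approach above avoids the Python serialization overhead.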
Upvotes: 6