Neil

Reputation: 31

Calculate previous year end date from current date

I have a bus_date column that holds many records with different dates, e.g. 2021-03-15, 2021-05-12, 2021-01-15.

I want to calculate the previous year end for each of these dates. My expected output is 2020-12-31 for all three dates.

I know I can use the function date_sub(start_date, num_days), but I don't want to pass num_days manually, since there are millions of rows with different dates. Can we write a view on the table, or create a DataFrame, that calculates the previous year end?

Upvotes: 1

Views: 3210

Answers (1)

过过招

Reputation: 4234

You can combine date_trunc and date_add to achieve this: truncate each date to the first day of its year, then subtract one day.

import pyspark.sql.functions as F

# ... (setup omitted; assumes an existing SparkSession named `spark`)
data = [
    ('2021-03-15',),
    ('2021-05-12',),
    ('2021-01-15',)
]
df = spark.createDataFrame(data, ['bus_date'])
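# Truncate bus_date to January 1 of its year, then step back one day to December 31 of the previous year.
df = df.withColumn('pre_year_end', F.date_add(F.date_trunc('yyyy', 'bus_date'), -1))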
df.show()
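
To sanity-check the logic without a Spark session, the same truncate-then-subtract idea can be sketched in plain Python. This is only an illustration: the helper name previous_year_end is hypothetical and not part of PySpark, and it works on individual datetime.date values rather than on a DataFrame column.

```python
from datetime import date, timedelta


def previous_year_end(d: date) -> date:
    """Hypothetical helper: last day of the year before d's year."""
    # Step 1: truncate to January 1 of the same year (mirrors date_trunc('yyyy', ...)).
    start_of_year = d.replace(month=1, day=1)
    # Step 2: go back one day (mirrors date_add(..., -1)).
    return start_of_year - timedelta(days=1)


# The three sample dates from the question.
dates = [date(2021, 3, 15), date(2021, 5, 12), date(2021, 1, 15)]
results = [previous_year_end(d) for d in dates]

for d, r in zip(dates, results):
    print(d, "->", r)
```

All three sample dates map to 2020-12-31, which matches the expected output in the question.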

Upvotes: 2
