Reputation: 5670
I have a Spark DataFrame as shown below:
#Create DataFrame
df <- data.frame(name = c("Thomas", "William", "Bill", "John"),
dates = c('2017-01-05', '2017-02-23', '2017-03-16', '2017-04-08'))
df <- createDataFrame(df)
#Make sure df$dates column is in 'date' format
df <- withColumn(df, 'dates', cast(df$dates, 'date'))
name | dates
--------------------
Thomas |2017-01-05
William |2017-02-23
Bill |2017-03-16
John |2017-04-08
I want to change dates
to the end of month date, so they would look like shown below. How do I do this? Either SparkR or PySpark code is fine.
name | dates
--------------------
Thomas |2017-01-31
William |2017-02-28
Bill |2017-03-31
John |2017-04-30
Upvotes: 7
Views: 27642
Reputation: 19348
last_day
is a poorly named function and should be wrapped in something more descriptive to make the code easier to read.
endOfMonth
is a better function name. Here's how to use this function with the Scala API. Suppose you have the following data:
+----------+
| some_date|
+----------+
|2016-09-10|
|2020-01-01|
|2016-01-10|
| null|
+----------+
Run the endOfMonth
function that's part of spark-daria:
import com.github.mrpowers.spark.daria.sql.functions._
df.withColumn("res", endOfMonth(col("some_date"))).show()
Here are the results:
+----------+----------+
| some_date| res|
+----------+----------+
|2016-09-10|2016-09-30|
|2020-01-01|2020-01-31|
|2016-01-10|2016-01-31|
| null| null|
+----------+----------+
I'll try to add this function to quinn as well so there is an easily accessible function for PySpark users as well.
Upvotes: 3
Reputation: 422
You may use the following (PySpark):
from pyspark.sql.functions import last_day
df.select('name', last_day(df.dates).alias('dates')).show()
To clarify, last_day(date)
returns the last day of the month of which date belongs to.
I'm pretty sure there is a similar function in sparkR https://spark.apache.org/docs/1.6.2/api/R/last_day.html
Upvotes: 16
Reputation: 5670
For completeness, here is the SparkR code:
df <- withColumn(df, 'dates', last_day(df$dates))
Upvotes: 1