Reputation: 181
I have a column with a timestamp from which I'm extracting the last day of each month using last_day()
, but the function gives me only the date part of the timestamp, and when I use to_timestamp()
within it, it zeroes out the time values which in this case are very relevant because that determines which values comes has higher precedence.
Is there a way to get the last day of the month of a timestamp in a timestamp format without losing the time values?
Currently, I have a regular SQL query using SparkSQL as follows:
spark.sql("SELECT DISTINCT max(col1) AS col1, \
col2, col3,\
max(col4) AS col4,\
last_day(col5_date) as col5_date\
FROM table \
GROUP BY col1, col2,\
col3, col4, col5").registerTempTable("table1")
The above gives me the following result:
Col1 | col2 | col3 | col4 | col5 |
---|---|---|---|---|
val1 | val2 | val3 | val4 | 2022-11-30 |
And when I tried the to_timestamp
function on the Last_day()
function this is what I get:
Col1 | col2 | col3 | col4 | col5 |
---|---|---|---|---|
val1 | val2 | val3 | val4 | 2022-12-31 00:00:00 |
However, the data does have the time data in it, like this: 2022-10-28 12:32:...
. Is there any way I can preserve the time in the timestamp when retrieving the last day of the month?
Upvotes: 0
Views: 505
Reputation: 31540
Method-1:
Try with make_timestamp()
function and pass year+month+date+hours+minutes+secs
to the function.
Explanation:
make_timestamp(year(last_day(ts)), -- get year from last_day()
month(last_day(ts)),-- get month from last_day()
day(last_day(ts)),-- get day from last_day()
hour(ts),-- get hour from timestamp column
minute(ts),-- get minute from timestamp column
second(ts))-- get second from timestamp column
Example:
with cte as (
select current_timestamp() as ts
) select *,make_timestamp(year(last_day(ts)),month(last_day(ts)),day(last_day(ts)),hour(ts),minute(ts),second(ts)) as col5 from cte
#ts col5
#2023-05-19T22:12:26.027+0000 2023-05-31T22:12:26.000+0000
Method-2:
With using combination of to_timestamp + last_day + date_format
functions we are going to get the HH:mm:ss from the timestamp and add it to last_day()
function.
with cte as (
select current_timestamp() as ts
) select *,to_timestamp(concat_ws(" ",last_day(ts),date_format(ts,"HH:mm:ss"))) as col5 from cte
#ts col5
#2023-05-19T22:18:08.710+0000 2023-05-31T22:18:08.000+0000
Upvotes: 1