Qks

Reputation: 25

first_value and last_value for each user id

I am trying to get the first scheduled start time and last scheduled end time for each user in my scheduled_jobs table for a date.

I can get it to work for one user, but when I try it for all users per day, I get the correct last time but not the correct first time: it displays the first user_id's time for every user_id.

Here is my code:

SELECT DISTINCT on (user_id)
    user_id, first_value(scheduled_jobs.at) over (order by user_id, scheduled_jobs.at ASC),
    last_value(scheduled_jobs.to) over (order by user_id, scheduled_jobs.at DESC)
FROM scheduled_jobs
WHERE scheduled_jobs.at between CURRENT_DATE+INTERVAL'3 day' and CURRENT_DATE +INTERVAL '4 day'

Example of current results:

user_id | first_value         | last_value
  19    | 2018-10-29 07:00:00 | 2018-10-29 17:00:00
  30    | 2018-10-29 07:00:00 | 2018-10-29 15:00:00
  37    | 2018-10-29 07:00:00 | 2018-10-29 16:30:00 

last_value is correct for each user_id, but first_value always shows the first user_id's value on every row.

I have tried splitting them into separate SELECT queries combined with a JOIN ... USING, but I still get incorrect results for first_value.

Upvotes: 1

Views: 393

Answers (2)

user330315

Reputation:

Why not simply use min() and max()? As you don't select any additional columns, there is no need for distinct on() or window functions to begin with:

SELECT user_id, 
       min(scheduled_jobs.at),
       max(scheduled_jobs.to)
FROM scheduled_jobs
WHERE scheduled_jobs.at between CURRENT_DATE + 3 and CURRENT_DATE + 4
group by user_id;

When you want to add days to a DATE value, you don't need to use an interval; you can just add an integer.
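
As a sketch of the same idea, the query below takes the end time from the to column (the question asks for the last scheduled end time) and replaces BETWEEN with a half-open range. BETWEEN includes both bounds, so a job starting at exactly midnight on CURRENT_DATE + 4 would otherwise be counted too. The aliases sj, first_start and last_end are made up for illustration, and the column names are assumed to be as in the question.

```sql
-- Sketch: first start and last end per user for the day three days from now.
-- Assumes scheduled_jobs(user_id, "at", "to") as in the question;
-- the aliases sj, first_start and last_end are illustrative only.
SELECT user_id,
       min(sj."at") AS first_start,   -- earliest scheduled start
       max(sj."to") AS last_end       -- latest scheduled end
FROM scheduled_jobs AS sj
WHERE sj."at" >= CURRENT_DATE + 3     -- inclusive lower bound
  AND sj."at" <  CURRENT_DATE + 4     -- exclusive upper bound
GROUP BY user_id
ORDER BY user_id;
```

Note that max() over to returns the latest end time of any job that day, which can differ from the end time of the latest-starting job if jobs overlap.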

Upvotes: 1

S-Man

Reputation: 23676

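You need a PARTITION BY clause, so that the window functions are evaluated separately for each user_id. Without it, the whole result set is one partition, and first_value returns the value from the first row overall.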

SELECT DISTINCT on (user_id)
    user_id, 
    first_value(sj.at) OVER (PARTITION BY user_id ORDER BY sj.at ASC),
    last_value(sj.to) OVER (PARTITION BY user_id ORDER BY sj.at DESC)
FROM 
    scheduled_jobs sj
WHERE 
    sj.at BETWEEN CURRENT_DATE + 3 and CURRENT_DATE + 4

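Additionally: be careful when using last_value, as it does not always work as expected. When the window has an ORDER BY, the default frame ends at the current row (including its peers), so last_value returns the current row's value rather than the last value in the partition.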

You should use first_value with DESC ordering instead:

first_value(scheduled_jobs.to) over (partition by user_id order by scheduled_jobs.at DESC)
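
If you would rather keep last_value, another option is to widen the frame explicitly so that it spans the whole partition. The following is a minimal sketch under the same assumptions as the question (columns user_id, at and to); the window name w and the output aliases are illustrative.

```sql
-- Sketch: first_value and last_value over the full partition.
-- Assumes scheduled_jobs(user_id, "at", "to") as in the question;
-- the window name w and the aliases are illustrative only.
SELECT DISTINCT ON (user_id)
       user_id,
       first_value(sj."at") OVER w AS first_start,
       last_value(sj."to")  OVER w AS last_end
FROM scheduled_jobs AS sj
WHERE sj."at" >= CURRENT_DATE + 3
  AND sj."at" <  CURRENT_DATE + 4
WINDOW w AS (
    PARTITION BY user_id
    ORDER BY sj."at"
    -- Explicit frame: every row sees the whole partition,
    -- so last_value really is the last row's value.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY user_id;
```

Because every row in a partition now yields the same two values, it does not matter which row DISTINCT ON keeps for each user_id.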

Upvotes: 2
