peterwimsey

Reputation: 562

Group by column, then find changes in another column, then get last row

I have a table like this:

user_name   sent_at     data
---------------------------------------------
alpha       2015-11-01  { "status" : "pending", … }
beta        2015-11-01  { "status" : "pending", … }
gamma       2015-11-01  { "status" : "pending", … }
alpha       2015-11-02  { "status" : "complete", … }
alpha       2015-11-03  { "status" : "complete", … }

For every user, I want to find the row where they reached their current status, i.e. the first row of the latest run of that status. So for user alpha that would be the row from 2015-11-02. Note that the same status can be repeated multiple times for the same user.

I ended up with:

SELECT *
FROM
(
    SELECT 
        user_name,
        sent_at,
        status,
        row_number() OVER (PARTITION BY user_name ORDER BY sent_at DESC) as row
    FROM
    (
        SELECT
            user_name,
            sent_at,
            data->>'status' AS status,
            lag(data->>'status') OVER (PARTITION BY user_name ORDER BY sent_at) AS prev_status
        FROM log
    ) AS t
    WHERE status IS DISTINCT FROM prev_status
) AS u 
WHERE row = 1

Is there a simpler way?

Upvotes: 1

Views: 50

Answers (1)

klin

Reputation: 121604

Your query looks OK, IMHO. If you want to avoid window functions, you can use DISTINCT ON twice, ordering by sent_at in opposite directions:

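-- Outer query: earliest row per user that carries the user's current status.
select distinct on (l1.user_name)
    l1.user_name,
    l1.sent_at,
    l2.status
from log l1
join (
    -- Inner query: latest row per user, i.e. the current status.
    select distinct on (user_name)
        user_name,
        sent_at,
        data->>'status' as status
    from log
    order by user_name, sent_at desc
    ) l2
on l1.user_name = l2.user_name and l1.data->>'status' = l2.status
order by l1.user_name, l1.sent_at;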
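
A possible middle ground, offered as a sketch rather than a drop-in: keep lag() to detect status changes, but let DISTINCT ON pick the latest change per user, which removes the row_number() layer from the question's query. The setup recreates the sample rows from the question with the elided JSON fields left out. The jsonb column type and the extra user delta are assumptions added purely for illustration.

```sql
-- Assumed schema: the question does not show column types.
CREATE TEMP TABLE log (user_name text, sent_at date, data jsonb);

INSERT INTO log VALUES
    ('alpha', '2015-11-01', '{"status": "pending"}'),
    ('beta',  '2015-11-01', '{"status": "pending"}'),
    ('gamma', '2015-11-01', '{"status": "pending"}'),
    ('alpha', '2015-11-02', '{"status": "complete"}'),
    ('alpha', '2015-11-03', '{"status": "complete"}'),
    -- Hypothetical user whose status returns to an earlier value.
    ('delta', '2015-11-01', '{"status": "pending"}'),
    ('delta', '2015-11-02', '{"status": "complete"}'),
    ('delta', '2015-11-03', '{"status": "pending"}');

-- Keep only rows where the status differs from the user's previous row,
-- then take the most recent such row per user.
SELECT DISTINCT ON (user_name)
    user_name,
    sent_at,
    status
FROM (
    SELECT
        user_name,
        sent_at,
        data->>'status' AS status,
        lag(data->>'status') OVER (PARTITION BY user_name ORDER BY sent_at) AS prev_status
    FROM log
) AS t
WHERE status IS DISTINCT FROM prev_status
ORDER BY user_name, sent_at DESC;

-- user_name | sent_at    | status
-- alpha     | 2015-11-02 | complete
-- beta      | 2015-11-01 | pending
-- delta     | 2015-11-03 | pending
-- gamma     | 2015-11-01 | pending
```

The delta rows show where the two approaches differ. The join-based query above matches every row that has the current status, so for delta it would return 2015-11-01 (the first pending row ever), while the lag() version returns 2015-11-03 (the start of the current run). If a status never recurs after a different one, both give the same result.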

Upvotes: 1
