Reputation: 12845
There is a table with document revisions and authors. Looks like this:
doc_id rev_id rev_date editor title,content so on.... 123 1 2016-01-01 03:20 Bill ...... 123 2 2016-01-01 03:40 Bill 123 3 2016-01-01 03:50 Bill 123 4 2016-01-01 04:10 Bill 123 5 2016-01-01 08:40 Alice 123 6 2016-01-01 08:41 Alice 123 7 2016-01-01 09:00 Bill 123 8 2016-01-01 10:40 Cate 942 9 2016-01-01 11:10 Alice 942 10 2016-01-01 11:15 Bill 942 15 2016-01-01 11:17 Bill
I need to find out moments when document was transferred to another editor - only first rows of every edition series.
Like so:
doc_id rev_id rev_date editor title,content so on.... 123 1 2016-01-01 03:20 Bill ...... 123 5 2016-01-01 08:40 Alice 123 7 2016-01-01 09:00 Bill 123 8 2016-01-01 10:40 Cate 942 9 2016-01-01 11:10 Alice 942 10 2016-01-01 11:15 Bill
If I use DISTINCT ON (doc_id, editor) it resorts a table and I see only one per doc and editor, that is incorrect. Of course I can dump all and filter with shell tools like awk | sort | uniq. But it is not good for big tables.
Window functions like FIRST_ROW do not give much, because I cannot partition by doc_id, editor not to mess all them.
How to do better?
Thank you.
Upvotes: 0
Views: 28
Reputation: 1269513
You can use lag()
to get the previous value, and then a simple comparison:
select t.*
from (select t.*,
lag(editor) over (partition by doc_id order by rev_date) as prev_editor
from t
) t
where prev_editor is null or prev_editor <> editor;
Upvotes: 1