Reputation: 187
I have a Document table:
| ID | DOC_ID | Count |
|----|--------|-------|
| 1  | 22     | 1     |
| 2  | 22     | 1     |
| 3  | 22     | 2     |
| 4  | 22     | 2     |
| 5  | 22     | 2     |
| 6  | 22     | 3     |
Can I write a SQL query that returns only the latest rows where a count change occurs? I'm using SQL Server.
Desired output:
ID DOC_ID Count
---------------------
6 22 3
Upvotes: 1
Views: 1096
Reputation: 1269445
In SQL Server 2012+ you can just use lag():
select t.*
from (select t.*, lag(count) over (partition by doc_id order by id) as prev_count
      from t
     ) t
where prev_count <> count;
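To try this against the data in the question, here is a minimal sketch that builds the six sample rows inline with a CTE named t (the CTE and the aliases v and s are only for illustration, not part of the original schema). The first row of each doc_id has a NULL prev_count, so the <> comparison filters it out.

```sql
-- Sample rows from the question, built inline so nothing is created or dropped.
with t as (
    select v.id, v.doc_id, v.[count]
    from (values (1, 22, 1), (2, 22, 1), (3, 22, 2),
                 (4, 22, 2), (5, 22, 2), (6, 22, 3)) as v (id, doc_id, [count])
)
select s.id, s.doc_id, s.[count]
from (select t.*, lag([count]) over (partition by doc_id order by id) as prev_count
      from t
     ) s
where s.prev_count <> s.[count];
-- Returns the rows with id 3 and id 6: every row whose count differs from the previous row.
```

Note that this returns every change row, not just the latest one.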
This is more painful in SQL Server 2008. I would recommend apply:
select t.*
from (select t.*, tprev.count as prev_count
      from t cross apply
           (select top (1) tprev.*
            from t tprev
            where tprev.doc_id = t.doc_id and tprev.id < t.id
            order by tprev.id desc
           ) tprev
     ) t
where prev_count <> count;
The subquery is not necessary for this version. I just put it there so you can see the relationship between the two queries. This works just as well:
select t.*
from t cross apply
     (select top (1) tprev.*
      from t tprev
      where tprev.doc_id = t.doc_id and tprev.id < t.id
      order by tprev.id desc
     ) tprev
where tprev.count <> t.count;
I can also add that if you know that the counts never change back, then you can get the first id for each one using aggregation:
select doc_id, count, min(id) as min_id
from t
group by doc_id, count;
Under many circumstances, this will be faster than the apply version (although lag() would often have the best performance).
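The desired output in the question shows only the most recent change (id 6). One way to get that, sketched here under the assumption that "latest" means the highest id per doc_id, is to rank the change rows with row_number() and keep the first. The CTE t with inline sample data and the aliases v, c, x, and rn are made up for this example.

```sql
-- Sample rows from the question, built inline so nothing is created or dropped.
with t as (
    select v.id, v.doc_id, v.[count]
    from (values (1, 22, 1), (2, 22, 1), (3, 22, 2),
                 (4, 22, 2), (5, 22, 2), (6, 22, 3)) as v (id, doc_id, [count])
)
select x.id, x.doc_id, x.[count]
from (select c.*,
             -- Rank only the change rows, newest first, within each doc_id.
             row_number() over (partition by c.doc_id order by c.id desc) as rn
      from (select t.*, lag([count]) over (partition by doc_id order by id) as prev_count
            from t
           ) c
      where c.prev_count <> c.[count]
     ) x
where x.rn = 1;
-- Returns one row: id 6, doc_id 22, count 3.
```

This needs SQL Server 2012+ because of lag(). A document whose count never changes returns no row.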
Upvotes: 3