Reputation: 72652
My table looks like:
id
caseid
status
Example data in the table:
1 10 open
2 10 published
3 10 reopened
4 11 open
5 11 contact
6 11 response
7 11 published
I need to get all caseid
s where the last status was published
.
So in the above example only 11
would be retrieved (because the caseid 10 was reopenend later).
What would the query look like?
PS
I'm using PDO with prepared statements.
Upvotes: 1
Views: 106
Reputation: 656291
I'll add another query, because it is the fastest one without indexes - it only requires a single sequential table scan. Simpler and faster than @Florin's query.
SELECT caseid
FROM tbl
GROUP BY caseid
HAVING max(id) = max(CASE WHEN status = 'published' THEN id ELSE NULL END);
However, with proper indexes @Elroy's solution is a lot faster. I actually ran a quick benchmark with a similar real life table to verify this.
Proper indexes:
CREATE INDEX tbl_id_idx ON tbl (id);
And this partial index for optimal results:
CREATE INDEX tbl_status_published_idx ON tbl (id) WHERE status = 'published';
You don't need any other indexes for this query. In particular, an additional index on caseid
is of no use. Always consider that index maintenance carries a cost, too - especially if the table is heavily written.
This will probably change with covering indexes ("index-only scans") in PostgreSQL 9.2. I wrote more about that in a recent answer on dba.SE.
Try EXPLAIN ANALYZE to verify.
Upvotes: 1
Reputation: 121599
EDITED:
Sorry - brain fart the first time around :(
Is this better?
select a.caseid,a.status
from temp a
where a.id in
(select MAX(id) maxid from temp group by caseid)
and status='published'
Upvotes: 1
Reputation: 17643
select a.caseid
from
(select max(id) as id, caseid
from table
group by caseid)a --this would get the last entry for every caseid
inner join table b on (a.id=b.id)
where b.status = 'published'
Upvotes: 3
Reputation: 3218
select caseid from data d1 where d1.status = 'published'
and not exists (select * from data d2 where d1.caseid = d2. caseid
and d2.id > d1.id)
For best performance, the id and caseid columns should be indexed.
Upvotes: 4