untitled
untitled

Reputation: 1335

Amazon Redshift : Fetch latest row query

saletable :

saleID | date | orderstatus | .....

I want to fetch only latest details for each saleID. I can do it using nested queries like

Select * from saletable t1 where date in ( select max(date) from saletable t2 where t1.saleID = t2.saleID ) 

Is it possible to do it with a simple query ? If so, any hint ?

Upvotes: 5

Views: 10738

Answers (1)

Patrick
Patrick

Reputation: 32234

You can use a common table expression to do this efficiently:

WITH ld AS (
  SELECT saleID, max("date") AS latest FROM saletable GROUP BY saleID)
SELECT s.*
FROM saletable s
JOIN ld ON ld.saleID = s.saleID
WHERE s."date" = ld.latest;

As commented by DogBoneBlues: This has the advantage over the original method as there are only 2 scans of the data (one is aggregated and the other is filtered, both of which a columnar DB like Redshift will do very efficiently). With the original approach, a query would be reissued for each row of the data resulting in an O(n2) operation.

Upvotes: 6

Related Questions