Reputation: 104
I'm working with a wide dataset with 500+ columns. The dataset contains a customer ID field and a time-stamp field. I'd like to query the data and end up with a table with only one row per customer ID field where the row retained is the row with the most recent timestamp. The query will be run on a Netezza server if that makes a difference. It seems like I could do this with a sub-query, but I can't seem to get syntax that works.
Upvotes: 1
Views: 330
Reputation: 1270873
Here is a typical way to approach this problem:
select t.*
from table t
where not exists (select 1
from table t2
where t2.customerid = t.customerid and
t2.timestamp > t.timestamp
);
This rephrases the question to: "Get me all rows from the table where there is no row with the same customer id and a larger timestamp."
Upvotes: 4