ApeWithPants
ApeWithPants

Reputation: 104

Dedupe and retain record with most recent timestamp

I'm working with a wide dataset with 500+ columns. The dataset contains a customer ID field and a time-stamp field. I'd like to query the data and end up with a table with only one row per customer ID field where the row retained is the row with the most recent timestamp. The query will be run on a Netezza server if that makes a difference. It seems like I could do this with a sub-query, but I can't seem to get syntax that works.

Upvotes: 1

Views: 330

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270873

Here is a typical way to approach this problem:

select t.*
from table t
where not exists (select 1
                  from table t2
                  where t2.customerid = t.customerid and
                        t2.timestamp > t.timestamp
                 );

This rephrases the question to: "Get me all rows from the table where there is no row with the same customer id and a larger timestamp."

Upvotes: 4

Related Questions