ak548
ak548

Reputation: 9

SQL Teradata - mark duplicate records in the column

So I know how to identify a dup row but now also need to identify the row linked to it and mark as duplicate. Ex:

Row     Name    ID  State   Date    Dup
---------------------------------------
001     Jim     001 NJ      jan2020 
002     Jim     001 NJ      jan2020
003     Tan     002 NY      feb2020
004     Allen   003 CA      Feb2020

Output should like:

Row     Name    ID  State   Date    Dup
---------------------------------------
001     Jim     001 NJ      jan2020 Y
002     Jim     001 NJ      jan2020 Y
003     Tan     002 NY      feb2020 N
004     Allen   003 CA      Feb2020 N

I can use partition using row_number but it will not flag the record 001 as Y. What could be an approach?

Upvotes: 0

Views: 515

Answers (1)

ravioli
ravioli

Reputation: 3833

If you have a small number of columns, you can do something like this:

SELECT Row, Name, ID, State, Date, 
  CASE 
    WHEN COUNT(*) OVER(PARTITION BY Name, ID, State, Date) > 1 THEN 'Y' 
    ELSE 'N' 
  END AS Dup
FROM MyTable

This marks a given row as a duplicate based on the columns specified in the PARTITION BY expression. Also, be careful with your column names (i.e. Row, Date), as they may be reserved words.

Upvotes: 1

Related Questions