Reputation: 15577

Delete older from a duplicate select

I have been working on a query to search and delete duplicate column values. Currently I have this query (returns duplicates):

SELECT NUIP, FECHA_REGISTRO
FROM registros_civiles_nacimiento
WHERE NUIP IN (
 SELECT NUIP
 FROM registros_civiles_nacimiento
 GROUP BY NUIP
 HAVING (COUNT(NUIP) > 1)
) order by NUIP

This work returning a table like this:

NUIP        FECHA_REGISTRO
38120100138 1975-05-30
38120100138 1977-08-31
40051800275 1980-09-24
40051800275 1999-11-29
42110700118 1972-10-26
42110700118 1982-04-22
44030700535 1982-10-19
44030700535 1993-05-05
46072300777 1991-01-17
46072300777 1979-03-30

The thing is that I need to delete the rows with duplicate column values. But I need to delete the row with the oldest date, for example, for the given result, once the needed query is performed, this is the list of result that must be kept:

NUIP        FECHA_REGISTRO
38120100138 1977-08-31
40051800275 1999-11-29
42110700118 1982-04-22
44030700535 1993-05-05
46072300777 1991-01-17

How can I do this using plain SQL?

Upvotes: 1

Answers (3)

user1166147

Reputation: 1610

--PULL YOUR SELECT OF RECS WITH DUPES INTO A TEMP TABLE 
--(OR CREATE A NEW TABLE SO THAT YOU CAN KEEP THEM AROUND FOR LATER IN CASE)
SELECT   NUIP,FECHA_REGISTRO
INTO #NUIP 
FROM     SO_NUIP
WHERE NUIP IN (
SELECT NUIP
 FROM SO_NUIP
 GROUP BY NUIP
 HAVING (COUNT(NUIP) > 1)
)

--CREATE FLAG FOR DETERMINIG DUPES
ALTER TABLE #NUIP ADD DUPLICATETOREMOVE bit

 --USE `RANK()` TO SET FLAG
 UPDATE #NUIP
 SET DUPLICATETOREMOVE = CASE X.RANK
        WHEN  1 THEN 1
        ELSE 0 
        END
--SELECT *
FROM #NUIP A
INNER JOIN (SELECT NUIP,FECHA_REGISTRO,RANK() OVER (PARTITION BY [NUIP] ORDER BY    FECHA_REGISTRO ASC) AS RANK
FROM #NUIP) X ON X.NUIP = A.NUIP AND X.FECHA_REGISTRO = A.FECHA_REGISTRO

--HERE IS YOUR DELETE LIST
SELECT * 
FROM so_registros_civiles_nacimiento R
JOIN #NUIP N ON N.NUIP = R.NUIP AND N.FECHA_REGISTRO = R.FECHA_REGISTRO
WHERE N.DUPLICATETOREMOVE = 1

--HERE IS YOUR KEEP LIST
SELECT * 
FROM so_registros_civiles_nacimiento R
JOIN #NUIP N ON N.NUIP = R.NUIP AND N.FECHA_REGISTRO = R.FECHA_REGISTRO
WHERE N.DUPLICATETOREMOVE = 0

--ZAP THEM AND COMMIT YOUR TRANSACTION, YOU'VE STILL GOT A REC OF THE DELETEDS FOR AS LONG AS THE SCOPE OF YOUR #NUIP
BEGIN TRAN --COMMIT  --ROLLBACK
DELETE FROM so_registros_civiles_nacimiento
JOIN #NUIP N ON N.NUIP = R.NUIP AND N.FECHA_REGISTRO = R.FECHA_REGISTRO
WHERE N.DUPLICATETOREMOVE = 1

Upvotes: 1

PaddyC

Reputation: 576

Use RANK() to create the result set ordered by date
Use WHERE EXISTS to delete from the source.

(Note: if you run the rank function over your duplicates, you should get your results. I've just referred to the whole table below)

This statement works in Oracle (replace the select * with delete if it works for you:

SELECT * 
FROM registros_civiles_nacimiento ALL_ 
WHERE EXISTS 
    (SELECT * FROM    
        (SELECT * FROM 
            (SELECT  NUIP, 
                     FECHA_REGISTRO, 
                     RANK() OVER (PARTITION BY NUIP ORDER BY FECHA_REGISTRO) AS ORDER_
             FROM registros_civiles_nacimiento)
         WHERE ORDER_ = 1) OLDEST
     WHERE   ALL_.NUIP = OLDEST.NUIP
     AND   ALL_.FECHA_REGISTRO = OLDEST.FECHA_REGISTRO);

Upvotes: 0

Lamak

Reputation: 70668

You can use analytical functions for this:

;WITH CTE AS
(
    SELECT *, ROW_NUMBER() OVER(PARTITION BY NUIP ORDER BY FECHA_REGISTRO DESC) RN
    FROM registros_civiles_nacimiento
)
DELETE FROM CTE
WHERE RN > 1;

Upvotes: 0

Delete older from a duplicate select

Answers (3)

Related Questions