mHelpMe
mHelpMe

Reputation: 6668

delete duplicate data which have different date entries keeping the earliest one

I have a table which has some incorrect data that I want to remove.

The logic is as follows. Where the code, name and source are equal (like the 1st three rows in the table below I want to keep the record with the earliest date, so 2017-01-02 in this case and then remove the other two rows. How to do this?

   DateEntry    Code    Name     Source
   2017-01-02   ABCF    Blah     SS
   2017-03-02   ABCF    Blah     SS
   2017-04-02   ABCF    Blah     SS       
   2017-03-08   BVFT    Andy     SS
   2017-04-14   BVFT    Andy     SS
   2017-04-20   BVFT    Blob     SS
   2017-04-28   BVFT    Blob     SS
   2017-05-14   BVFT    Blob     SS
   2017-06-02   BVFT    Blob     SS

What I want,

   DateEntry    Code    Name     Source
   2017-01-02   ABCF    Blah     SS  
   2017-03-08   BVFT    Andy     SS
   2017-04-20   BVFT    Blob     SS

Upvotes: 0

Views: 59

Answers (6)

Mahesh.K
Mahesh.K

Reputation: 901

Try with this

;with cte as (
SELECT *,ROW_NUMBER() OVER(PARTITION BY Code,Name,Source ORDER BY DateEntry DESC) AS rnum
FROM MyTable
)
delete from cte where rnum>1

Upvotes: 0

Asif Rahman
Asif Rahman

Reputation: 21

WITH t1 AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY Code,Name,Source ORDER BY DateEntry) AS   
'Rank',Code,Name,Source 
FROM TABLE
)
DELETE FROM t1 WHERE t1.Rank>1

Upvotes: 0

Ullas
Ullas

Reputation: 11556

Give a row number partitioned by the columns Code, Name and Source and order by ascending order of DateEntry column. Then delete the rows which having row number greater than 1.

Query

;with cte as(
    select [rn] = row_number() over(
        partition by Code, Name, Source
        order by DateEntry    
     ), *
     from your_table_name
)
delete from cte
where rn > 1;

Find a demo here

Upvotes: 1

Yogesh Sharma
Yogesh Sharma

Reputation: 50163

You should ROW_NUMBER() function By SubQuery Form :

SELECT DateEntry,
       Code,
       Name,
       Source
FROM
(
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY Code,
                                          Name,
                                          Source ORDER BY dateentry) rn
    FROM <table_name>
) a
WHERE rn = 1;

Result :

|  DateEntry | Code | Name | Source |
|------------|------|------|--------|
| 2017-01-02 | ABCF | Blah |     SS |
| 2017-03-08 | BVFT | Andy |     SS |
| 2017-04-20 | BVFT | Blob |     SS |

Upvotes: 1

user8768978
user8768978

Reputation:

You can use the ranking function ROW_NUMBER with PARTITION BY Code, Name:

WITH Ranked 
AS
(
   SELECT
      DateEntry, Code,Name, Source,
      ROW_NUMBER() OVER(PARTITION BY Code, Name ORDER BY DateEntry) AS RN
   FROM table1
)
SELECT DateEntry, Code,Name, Source
FROM Ranked
WHERE RN = 1;

The ranking function will generate a row number for each row, with the first in the group with 1, the one with the earliest date. Then the filter will remove all except the first row in each group.

Results:

|  DateEntry | Code | Name | Source |
|------------|------|------|--------|
| 2017-01-02 | ABCF | Blah |     SS |
| 2017-03-08 | BVFT | Andy |     SS |
| 2017-04-20 | BVFT | Blob |     SS |

Keep in mind that, this won't actually delete the other rows, this is just a select.

Upvotes: 0

Pete Carter
Pete Carter

Reputation: 2731

SELECT * FROM (
SELECT 
    *,
    ROW_NUMBER() OVER(PARTITION BY DateEntry ORDER BY DateEntry DESC) AS rn
FROM MyTable
) a
WHERE rn = 1

Upvotes: 0

Related Questions