pbnyc
pbnyc

Reputation: 49

How can I delete row(s) with duplicate row_id but different date field values in a single table?

mysql - How can I delete row(s) with duplicate 'row_id' but different 'recd_date' field values in a single table, leaving only the row with the most recent'recd_date'?

Tried script below...

DELETE FROM table1 
WHERE table1.id = table1.id 
AND table1.recd_date < table1.recd_date;

...but it deletes all records in the table, which logically, I can understand. But I feel like there should be a pretty simple and straight forward solution to this.

The images below are examples of what exists and the desired results..

example of existing table data

doc_id recd_date  recd_amount update_date
    72 2015-01-01        6000 2015-04-30
    72 2015-01-01        6000 2015-02-30
    12 2013-02-01        5000 2015-08-31
    12 2013-02-01        5000 2014-08-31
    12 2013-02-01        5000 2013-06-30

example of desired result

doc_id recd_date  recd_amount update_date
    72 2015-01-01        6000 2015-04-30
    12 2013-02-01        5000 2015-08-31

Upvotes: 1

Views: 43

Answers (1)

Strawberry
Strawberry

Reputation: 33935

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(doc_id INT NOT NULL
,recd_date DATE NOT NULL
,recd_amount INT NOT NULL
,update_date DATE NOT NULL
,PRIMARY KEY(doc_id,update_date)
);

INSERT INTO my_table VALUES
(72,'2015-01-01',6000,'2015-04-30'),
(72,'2015-01-01',6000,'2015-02-28'),
(12,'2013-02-01',5000,'2015-08-31'),
(12,'2013-02-01',5000,'2014-08-31'),
(12,'2013-02-01',5000,'2013-06-30');

SELECT * 
  FROM my_table x 
  LEFT 
  JOIN  
     ( SELECT doc_id
            , MAX(update_date) max_update_date 
         FROM my_table 
        GROUP 
           BY doc_id
     ) y 
    ON y.doc_id = x.doc_id 
   AND y.max_update_date = x.update_date;

+--------+------------+-------------+-------------+--------+-----------------+
| doc_id | recd_date  | recd_amount | update_date | doc_id | max_update_date |
+--------+------------+-------------+-------------+--------+-----------------+
|     12 | 2013-02-01 |        5000 | 2013-06-30  |   NULL | NULL            |
|     12 | 2013-02-01 |        5000 | 2014-08-31  |   NULL | NULL            |
|     12 | 2013-02-01 |        5000 | 2015-08-31  |     12 | 2015-08-31      |
|     72 | 2015-01-01 |        6000 | 2015-02-28  |   NULL | NULL            |
|     72 | 2015-01-01 |        6000 | 2015-04-30  |     72 | 2015-04-30      |
+--------+------------+-------------+-------------+--------+-----------------+

DELETE x
  FROM my_table x 
  LEFT 
  JOIN  
     ( SELECT doc_id
            , MAX(update_date) max_update_date 
         FROM my_table 
        GROUP 
           BY doc_id
     ) y 
    ON y.doc_id = x.doc_id 
   AND y.max_update_date = x.update_date
 WHERE y.doc_id IS NULL;
Query OK, 3 rows affected (0.00 sec)

SELECT * FROM my_table;
+--------+------------+-------------+-------------+
| doc_id | recd_date  | recd_amount | update_date |
+--------+------------+-------------+-------------+
|     12 | 2013-02-01 |        5000 | 2015-08-31  |
|     72 | 2015-01-01 |        6000 | 2015-04-30  |
+--------+------------+-------------+-------------+

http://sqlfiddle.com/#!9/fca88/1

Upvotes: 1

Related Questions