Reputation: 7659
This question is similar to but different from other SO questions involving removal of duplicate rows in MySQL.
Chosen solutions in the referenced questions(below) fail where one or more of the columns contain NULL values. I'm including schema as well as two chosen answers in a sqlfiddle below, to illustrate where the previous solutions are not working.
Source schema:
CREATE TABLE IF NOT EXISTS `deldup` (
`or_id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` varchar(5) NOT NULL,
`txt_value` varchar(20) NOT NULL,
`date_of_revision` date NOT NULL,
`status` int(3) DEFAULT NULL,
PRIMARY KEY (`or_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=16 ;
INSERT INTO `deldup` (`or_id`, `order_id`, `txt_value`, `date_of_revision`, `status`) VALUES
(1, '10001', 'ABC', '2003-03-06', NULL),
(2, '10001', 'RFE', '2003-03-11', NULL),
(3, '10002', 'ASE', '2009-08-05', NULL),
(4, '10003', 'PEF', '2001-11-03', NULL),
(5, '10004', 'OIU', '1999-10-29', NULL),
(6, '10005', 'FOO', '2002-03-01', NULL),
(7, '10006', 'RTY', '2005-08-19', NULL),
(8, '10001', 'NND', '2003-03-20', NULL),
(9, '10005', 'VBN', '2002-02-19', NULL),
(10, '10002', 'AAQ', '2009-08-13', NULL),
(11, '10002', 'EEW', '2009-08-07', NULL),
(12, '10001', 'ABC', '2003-03-06', 3),
(13, '10001', 'ABC', '2003-03-06', 3),
(14, '10001', 'ABC', '2003-03-06', NULL),
(15, '10001', 'ABC', '2003-03-06', NULL);
Solution Example 1:
http://sqlfiddle.com/#!2/983f3/1
create temporary table tmpTable (or_id int);
insert tmpTable
(or_id)
select or_id
from deldup yt
where exists
(
select *
from deldup yt2
where yt2.txt_value = yt.txt_value
and yt2.order_id = yt.order_id
and yt2.date_of_revision = yt.date_of_revision
and yt2.status = yt.status
and yt2.or_id > yt.or_id
);
delete
from deldup
where or_id in (select or_id from tmpTable);
Note that the rows containing non-null row values are successfully deleted, however rows containing null values are not removed (See row 14 and 15 in the resulting SELECT query).
Solution Example 2:
http://sqlfiddle.com/#!2/8a4f8/1
DELETE
n1
FROM
deldup n1,
deldup n2
WHERE
n1.or_id < n2.or_id AND
n1.order_id = n2.order_id AND
n1.txt_value = n2.txt_value AND
n1.date_of_revision = n2.date_of_revision AND
n1.status = n2.status
This solution involves less code and works just the same as Example 1, including the exclusion of rows containing null values.
How can I remove the duplicate rows where one of the column values contains NULL values?
References:
Delete all Duplicate Rows except for One in MySQL?
Remove duplicate rows in MySQL
How to remove duplicate entries from a mysql db?
Upvotes: 3
Views: 1443