pro_newbie
pro_newbie

Reputation: 336

Remove duplicate records from database table

I am using

mysql Ver 14.14 Distrib 5.5.35, for debian-linux-gnu (x86_64) using readline 6.2

I have a situation in my mysql database table, where I have got duplicate records, and the strange thing is that these records have an extra space in between there itemCode.

For example,

mysql> select * from tbl_item;
+--------+-----------------+---------------------------+
| itemId | itemCOde        | itemdescription           |
+--------+-----------------+---------------------------+
|      1 | 0001 1234567    | Test description data     |
|      2 | 0001  1234567   | Test description data     |
|      3 | 0120 21113      | dummy data                |
|      4 | 0120  21113     | dummy data                |
|      5 | 145 78953       | hello world               |
|      6 | 487 77787       | testing data              |
+--------+-----------------+---------------------------+

By running the following query, I am able to find all the duplicate records:

mysql> select itemId, itemCode, itemDescription, count(*) as count from tbl_item Group By itemDescription having count(*) > 1;
+--------+-----------------+---------------------------+-------+
| itemId | itemCode        | itemdescription           | count |
+--------+-----------------+---------------------------+-------+
|      1 | 0001 1234567    | Test description data     |   2   |
|      3 | 0120 21113      | dummy data                |   2   |
+--------+-----------------+---------------------------+-------+

But now, how I am suppose to delete the records which have and extra space in between the itemCode. Any suggestions?

mysql> desc tbl_item;
+-----------------+--------------+------+-----+---------+----------------+
| Field           | Type         | Null | Key | Default | Extra          |
+-----------------+--------------+------+-----+---------+----------------+
| itemID          | int(11)      | NO   | PRI | NULL    | auto_increment |
| itemCode        | varchar(50)  | NO   |     | NULL    |                |
| itemDescription | varchar(200) | NO   |     | NULL    |                |
+-----------------+--------------+------+-----+---------+----------------+

Upvotes: 0

Views: 82

Answers (1)

lopo
lopo

Reputation: 306

With this query you should find the items, which have another entry with same description and a shorter itemCode.

     SELECT tbl_item.* 
     FROM tbl_item
     INNER JOIN tbl_item AS t2
      ON t2.itemDescription = tbl_item.itemDescription
      AND LENGTH(t2.itemCode) < LENGTH(tbl_item.itemCode)

With this you can find all entries, which have no other one with the same description and a shorter itemCode. So all right entries. This result you can insert in a table-copy and after that rename.

     SELECT tbl_item.* 
     FROM tbl_item
     LEFT JOIN tbl_item AS t2
      ON t2.itemDescription = tbl_item.itemDescription
      AND LENGTH(t2.itemCode) < LENGTH(tbl_item.itemCode)
     WHERE t2.itemId IS NULL

Upvotes: 1

Related Questions