MySQL Updating duplicate records

Question

I'm not even sure this is possible in a query, but here goes. I'm cleaning up data coming into our system from a 3rd party. They're creating a not-so-unique unique string for each person in a group that consists of the first 3 letters of the first name and then the dob in yyyymmdd format. On the few that are duplicates, I want to append -1, -2, -3 etc. to that ID so they truly are unique.

JOH19700101     0001    JOHN SMITH    1970-01-01
JOH19700101     0001    JOHNSON SMITH 1970-01-01
JOH19700101     0002    JOHN SMITH    1970-01-01

In the above, the first two IDs (JOH19700101) need to have -1 and -2 appended because they are in the same group (0001). The 3rd can remain the same as he's in a different group (0002).

Selecting the duplicates was easy enough:

SELECT x.id, x.uid, x.group, x.name, x.dob
FROM import x
INNER JOIN (
    SELECT uid
    FROM import
    GROUP BY group, uid
    HAVING COUNT(uid) > 1) z ON x.uid = z.uid;

Is there a way to add an update to this so that the first two samples would get new UIDs of JOH19700101-1 and JOH19700101-2 respectively?

Simon Woolf · Accepted Answer

This will give you unique uid values although it could benefit from one small refinement. Right now the number that is appended keeps incrementing, so your first pair of dupes will get -1 and -2 appended to their uids, then the next pair -3 and -4, and so on.

It should be possible to reset @inc somehow, however as it stands this will ensure that all of your uid values are unique, and that identical uids in different groups will remain unchanged.

SET @inc :=0;

UPDATE 
 `import` i
       JOIN (
   SELECT uid, `group`
   FROM `import`
      GROUP BY `group`, uid
   HAVING COUNT(uid) > 1) d ON 
  i.uid = d.uid AND 
  i.group = d.group

SET
  i.uid = CONCAT(i.uid, '-',@inc:=@inc+1)

MySQL Updating duplicate records

Answers (1)

Related Questions