Quix
Quix

Reputation: 13

Cleanup Update for Duplicate Entries

I have a table that contains user information (UserInfo), and some users may be in the table twice under different UserIds, like so:

UserId       LastName       FirstName       active
000001       Jetson         George          1
000002       Flintstone     Fred            0
000003       Jetson         George          1
000004       Flintstone     Fred            1
000005       Barbara        Hannah          1

Only 1 row per user should show as active, but a bug in the application is causing there to be two user entries with unique UserIds that have active set to 1.

I then have another table that contains records associated with a UserId (UserRecords). In the case of duplicate active UserIds, only one of the two UserIds will return any results in the second table. Thus:

SELECT 
((SELECT count(*)
FROM UserRecords recs
where recs.UserId= inf.UserId)) as Records, *
FROM UserInfo inf
where inf.lastname = 'Jetson' and
inf.active='1' and
inf.firstname='George'

might return:

Records       UserId       LastName       FirstName       active
0             000001       Jetson         George          1
1273          000003       Jetson         George          1

I want to create an update statement to change the active column to 0 for any entries where active = '1' and Records='0', but there are hundreds of users that have duplicate entries, and there could be more later. So I would like to create an update statement that would find these and set them to 0 automatically.

What I have is

update UserId
set active = '0'
where (SELECT count(*)
FROM UserRecords recs
where recs.UserId= inf.UserId) = 0

The problem with that statement is that it does not take into account duplicate users. There could be users who simply have no records in UserRecords (yet) but that also do not have duplicate entries. Setting those to 0 would cause system problems.

So, how can I change my update statement to only mark active as 0 where there are duplicate entries?

Any help would be greatly appreciated.

Thanks!

Upvotes: 1

Views: 53

Answers (2)

Jenn
Jenn

Reputation: 795

Okay, so here's what I would recommend. You want to change your where clause to specify only the duplicates. Also, you really only want to look at the active records because it doesn't matter if there are duplicates of inactive records.

In order to see if there is a duplicate, you can use an exists. In order to use the exists, first we are going to write a subquery to pull back duplicate records, aka anything with the same first and last name, a different id, and is also active. If the subquery pulls something back, exists will return true and we will update the record. If there are no duplicates, the subquery will not grab any records and the exists will return false. Then, we will not update the record.

update u
set active = 0
From UserInfo u
where (SELECT count(*)
FROM UserRecords recs
where recs.UserId= u.UserId) = 0
and     u.active = 1
and     exists (Select 1
                From UserInfo u2
                Where u2.lastname = u.lastname
                and     u2.firstname = u.firstname
                and     u2.userid <> u.userid
                and     u2.active = 1)

Upvotes: 1

Barkermn01
Barkermn01

Reputation: 6842

insted of deactivating them you can remove them the only problem i see is what happens if you have 2 users called the same thing but to remove them insted of changin active you can

ALTER IGNORE TABLE `UserInfo` ADD UNIQUE INDEX idx_urows (last_name, first_name);

This will also cause MySQL to error when you try to add a duplicate i would say you should have an email address or something unique to a user in there as names are not unique.

Taken From Remove duplicate rows in MySQL

Upvotes: 0

Related Questions