hdx
hdx

Reputation: 4548

Case insensitive duplicates SQL

So I have a users table where the user.username has many duplicates like:

username and Username and useRnAme
john and John and jOhn

That was a bug and these three records should have been only one.

I'm trying to come up with a SQL query that lists all of these cases ordered by their creation date, so ideally the result should be something like this:

username jan01
useRnAme jan02
Username jan03
john     feb01 
John     feb02
jOhn     feb03

Any suggestions will be much appreciated

Upvotes: 21

Views: 31069

Answers (6)

Badmous
Badmous

Reputation: 135

so this is what i came up with. this was written against a postgres db but should work fine still against other sql engine.

select * from user u join user u2
on upper(u.email)=upper(u2.email) where u.id != u2.id
order by u.email;

so the query assume that the email are duplicate but the ids are not so it is looking to pull records with a duplicate email (case insensitive) but with unique id

Upvotes: 0

ThiliKarunarathne
ThiliKarunarathne

Reputation: 101

SELECT UserName, CreatedDate
FROM YourTable 
WHERE UserName COLLATE UTF8_BIN != LOWER(UserName COLLATE UTF8_BIN)
GROUP BY UserName, CreatedDate
HAVING COUNT(*) > 1

Upvotes: 0

Larry Lustig
Larry Lustig

Reputation: 50970

Leaving aside the issue of case sensitivity for a moment, the basic strategy is:

 SELECT username, create_date FROM your_table
     WHERE username IN 
     (SELECT username FROM your_table GROUP BY username HAVING COUNT(*) > 1)
 ORDER BY username, create_date

Many RDBMSes (including MySQL assuming that you are using CHAR or VARCHAR for the username column), perform case-insensitive searching by default. For those databases, the above solution will work. To solve the case sensitivity issue for other products , wrap all except the first occurrence of username in the uppercase conversion function specific to your RDBMS:

 SELECT username, create_date FROM your_table
     WHERE UPPER(username) IN 
     (SELECT UPPER(username) FROM your_table GROUP BY UPPER(username) HAVING COUNT(*) > 1)
 ORDER BY username, create_date

Upvotes: 49

Christoph
Christoph

Reputation: 4401

Try something like these

SELECT UserName, CreatedDate
FROM User
WHERE LOWER(TRIM(UserName)) IN 
(
SELECT LOWER(TRIM(UserName))
FROM User
GROUP BY LOWER(TRIM(UserName))
HAVING count(*) > 1
)

Upvotes: 3

Andomar
Andomar

Reputation: 238086

In MySQL, a case-sensitive compare is done using a binary collation. So you could join the table on itself, looking for rows where the case sensitive compare is different from the case insensitive compare:

select *
from YourTable t1
inner join YourTable t2 
on t1.name <> t2.name collate latin1_bin
and t1.name = t2.name

Upvotes: 0

3Dave
3Dave

Reputation: 29051

Use ToLower() or equivalent function in your SELECT, and order by that column.

Upvotes: 0

Related Questions