Joeme
Joeme

Reputation: 513

MYSQL distinct column row pair query

Imagine a table of contacts, where the same contact has multiple entries, but with differing data. How would one go about selecting this data for review? Unfortunately, a merge of sorts would be disagreeable as there may exist visually identifiable erroneous data that is not currently envisaged to be automatically processed.

FName         LName           Email           Phone
Heywood       Yapinchme                       555-555-555
Heywood       Yapinchme       [email protected]               
Seymour       Butz            [email protected]   
Seymour       Butz                            555-555-556
Seymour       Butz            
Hughe         Jass            [email protected]     555-555-557
Amanda        Hugginkiss      [email protected]

I would like to see just the duplicates of the first two columns where more than one entry exists. i.e.

FName         LName           Email           Phone
Heywood       Yapinchme                       555-555-555
Heywood       Yapinchme       [email protected]               
Seymour       Butz            [email protected]   
Seymour       Butz                            555-555-556
Seymour       Butz            

The next step of review is in the blue ether. Currently a little over a million rows, Bart has been busy. But beefy servers and this isn't regular operation but a one off to deal with data migration, so can be slightly gash.

I have tried a bit with SELECT DISTINCT and GROUP BY but it seems to just return on of each contact.

Upvotes: 3

Views: 44

Answers (1)

forpas
forpas

Reputation: 164089

You can use aggregation to identify the duplicates:

SELECT FName, LName
FROM tablename
GROUP BY FName, LName
HAVING COUNT(*) > 1

and if you want all the rows of the duplicates:

SELECT *
FROM tablename
WHERE (FName, LName) IN (
  SELECT FName, LName
  FROM tablename
  GROUP BY FName, LName
  HAVING COUNT(*) > 1
)

If your MySql/MariaDB version supports window functions:

SELECT t.*
FROM (
  SELECT *, COUNT(*) OVER (PARTITION BY FName, LName) counter
  FROM tablename
) t
WHERE t.counter > 1

Upvotes: 3

Related Questions