Reputation: 513
Imagine a table of contacts, where the same contact has multiple entries, but with differing data. How would one go about selecting this data for review? Unfortunately, a merge of sorts would be disagreeable as there may exist visually identifiable erroneous data that is not currently envisaged to be automatically processed.
FName LName Email Phone
Heywood Yapinchme 555-555-555
Heywood Yapinchme [email protected]
Seymour Butz [email protected]
Seymour Butz 555-555-556
Seymour Butz
Hughe Jass [email protected] 555-555-557
Amanda Hugginkiss [email protected]
I would like to see just the duplicates of the first two columns where more than one entry exists. i.e.
FName LName Email Phone
Heywood Yapinchme 555-555-555
Heywood Yapinchme [email protected]
Seymour Butz [email protected]
Seymour Butz 555-555-556
Seymour Butz
The next step of review is in the blue ether. Currently a little over a million rows, Bart has been busy. But beefy servers and this isn't regular operation but a one off to deal with data migration, so can be slightly gash.
I have tried a bit with SELECT DISTINCT
and GROUP BY
but it seems to just return on of each contact.
Upvotes: 3
Views: 44
Reputation: 164089
You can use aggregation to identify the duplicates:
SELECT FName, LName
FROM tablename
GROUP BY FName, LName
HAVING COUNT(*) > 1
and if you want all the rows of the duplicates:
SELECT *
FROM tablename
WHERE (FName, LName) IN (
SELECT FName, LName
FROM tablename
GROUP BY FName, LName
HAVING COUNT(*) > 1
)
If your MySql/MariaDB version supports window functions:
SELECT t.*
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY FName, LName) counter
FROM tablename
) t
WHERE t.counter > 1
Upvotes: 3