lifeson
lifeson

Reputation: 171

MYSQL remove duplicates where multiple columns match

I have a table > 500,000 rows where there are duplicate entries that need to be removed but only where a few columns match.

The main table has the following columns

id,
countryID,
postalCode,
adminName1,
adminName2,
placeName,
adminName3,
latitude,
longitude

I need to remove duplicates (leaving the first record) where placeName, latitude & longitude match

I had a search and found this which looks right but doesnt work for me. I have duplicated the original table structure into a new table (tblTemp)

INSERT INTO tblTemp(id,countryID,postalCode,adminName1,adminName2,placeName,adminName3,latitude,longitude)
SELECT DISTINCT placeName,latitude,longitude
FROM tblCountry_admin;

But i get the error

Column count doesn't match value count at row 1

Upvotes: 0

Views: 1787

Answers (3)

ScaisEdge
ScaisEdge

Reputation: 133410

if you want delete the duplicate row by laceName, latitude & longitude leaving the the old one you could

You could check for select the duplicated rows this wat

  select * from tblCountry_admin
  where id not in (
     select min(id)
     from tblCountry_admin
     group by placename, latitude, longitude
     )

then you could delete this way

delete from tblCountry_admin
where id not in (
   select min(id)
   from tblCountry_admin
   group by placename, latitude, longitude
   )

the error you get in you insert select id due by the fact the number of column in insert don't match the number of column is select

Upvotes: 1

Gurwinder Singh
Gurwinder Singh

Reputation: 39537

Assuming the tblTemp has same set of columns as tblCountry_admin and you want to get the rows with max id in case of duplicates, you can use this:

INSERT INTO tblTemp
select a.*
from tblCountry_admin a left join tblCountry_admin b on a.placeName = b.placeName
    and a.latitude = b.latitude
    and a.longitude = b.longitude
    and a.id < b.id
where b.id is null;

If you want to create the table using the select use:

create table tblTemp as
select a.*
from tblCountry_admin a left join tblCountry_admin b on a.placeName = b.placeName
    and a.latitude = b.latitude
    and a.longitude = b.longitude
    and a.id < b.id
where b.id is null;

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271171

Use WHERE:

INSERT INTO tblTemp (id, countryID, postalCode, adminName1,adminName2,     
                     placeName, adminName3, latitude, longitude)
    SELECT id, countryID, postalCode, adminName1, adminName2,
           placeName, adminName3, latitude, longitude
    FROM tblCountry_admin a
    WHERE a.id = (SELECT MIN(a2.id)
                  FROM tblCountry_admin a2
                  WHERE a2.placeName = a.placeName AND
                        a2.latitude = a.latitude AND
                        a2.longitude = a.longitude
                 );

Upvotes: 0

Related Questions