Adam
Adam

Reputation: 20922

MySQL Select Distinct one Column from Many

 INSERT INTO geo (city, country, latitude, longitude)
 SELECT ansiname, country_code, latitude, longitude
 FROM   geonames2
 WHERE country_code='HK';

The following statement works fine except the 'ansiname' column has duplicates in it.

Meaning there can be a few city names with the same name but different lat/long.

 HK Kowloon 25.6225     56.2225
 HK Kowloon 24.80388    56.19449

Is there a way to alter this query so it will only select (and therefore insert) one instance of each ansiname? Therefore dropping the second (with diff lat/long).

I updated the SELECT as follows unsuccessfully:

 SELECT DISTINCT(ansiname), country_code, latitude, longitude

thx

Upvotes: 1

Views: 2319

Answers (2)

Michael Fredrickson
Michael Fredrickson

Reputation: 37398

You can use the GROUP BY clause to accomplish this:

INSERT INTO geo (city, country, latitude, longitude)
SELECT ansiname, country_code, MAX(latitude), MAX(longitude)
FROM geonames2
WHERE country_code='HK'
GROUP BY ansiname, country_code

NOTE: By default, MySql doesn't require you to put the latitude and longitude into an aggregate function (MAX) if it isn't listed in the GROUP BY, but this setting is configurable and can vary from server to server... so it is a good practice to include the aggregate functions to avoid any potential errors.


EDIT: The issue with my suggest approach above is that it splits the lat / long. Since you have an ID column in your table, you could select a single ID for every ansiname, country_code combination, and then pull the results for that ID:

INSERT INTO geo (city, country, latitude, longitude)
SELECT ansiname, country_code, latitude, longitude
FROM geonames2
WHERE Id IN (
    SELECT MAX(Id) 
    FROM geonames2
    WHERE country_code='HK' 
    GROUP BY ansiname, country_code)

Upvotes: 2

Churk
Churk

Reputation: 4637

INSERT INTO geo (city, country, latitude, longitude)
SELECT ansiname, country_code, latitude, longitude
FROM geonames2
WHERE id IN (select id FROM table WHERE country_code='HK' GROUP BY geonames2, country_code )

The subselect should return you ID of the first instance of a name / country_code pair

Upvotes: 2

Related Questions