Reputation: 338
I have a table of places and their regions, but sometimes there is more than one place with the same name in the same region. Real life examples: There are five Springfields in Wisconsin. Nova Scotia, where I live, has three Sandy Coves. I need to find these sorts of records and either disambiguate them (by adding their county or equivalent, for example) or just delete junk/overlapping ones. That will take some work, but first I'd like to see how much work.
PlaceName: RegionName:
Summerville Big State
Summerville Bigger State (OK, different states, no problem...)
Summerville Little State <-
Summerville Little State <- I need to deal with these
This query, based on an answer from another question, gets me all of the places with the same name:
SELECT * FROM Places WHERE PlaceName IN
(SELECT PlaceName FROM Places GROUP BY PlaceName HAVING COUNT(*) > 1);
This is a good start, but I want to skip over the names that don't occur more than once in the same state and just get straight to the problem cases. To put it generically, I'd like to find non-unique records in one column and from there get the ones that are non-unique in another column.
(FWIW, I'm using MariaDB, which is mostly compatible with MySQL.)
Upvotes: 0
Views: 165
Reputation: 1270011
One way to do this is with a join
to the aggregated list. You need to aggregate by both region and place to get the list you want:
SELECT p.*, rp.cnt
FROM Places p join
(SELECT RegionName, PlaceName, COUNT(*) as cnt
FROM Places
GROUP BY RegionName, PlaceName
HAVING COUNT(*) > 1
) rp
on p.RegionName = rp.RegionName and p.PlaceName = rp.PlaceName;
You don't mention the actual database you are using. There are other ways to phrase this, some dependent on the database.
Upvotes: 1
Reputation: 65284
Basically you want to
So let's just do this
SELECT
PlaceName, RegionName, Count(*) AS num
FROM Places
GROUP BY CONCAT(PlaceName,':::',RegionName)
HAVING COUNT(*)>1
Upvotes: 1
Reputation: 17579
I am not sure but it seems to be just simple group by two fields
select PlaceName , RegionName
from Places
group by PlaceName , RegionName
having count(*) >1
Upvotes: 0
Reputation: 916
I think you can just concatenate columns:
SELECT * FROM Places WHERE PlaceName + RegionName IN
(SELECT PlaceName + RegionName FROM Places GROUP BY PlaceName + RegionName HAVING COUNT(*) > 1);
If I'm wrong, I'm sure other StackOverflowers will let me know!! :D
Upvotes: 0