Reputation: 11433
I have a SQL table containing customer information (we'll call it CustomerDB
), including an address column. Many of the rows in this table have exact duplicate addresses. The business requirement is to retrieve rows from the database such that:
So, if I have a data set like this (I have also entered this in the Data Explorer, to make it easy to test queries against):
ID NAME COMPANY ADDR1
16242 TOM E Company A 101 First RD
16241 RONALD J Company B 12 Tenth AVE
16235 KENNETH H 12 Tenth AVE
16238 MICHAEL H Company C 12 Tenth AVE
16243 ANTONIO D Company D 264 Long ST STE 5
16237 MICHAEL B Company E 264 Long ST STE 5
16234 WALTER L Company F 73 North RD
16236 CARL O Company G 73 North RD
16239 MICHAEL S Company H 73 North RD
16240 MICHAEL I Company I 73 North RD
I would like to get all of those rows except for one of the "73 North RD
" records. I hope I'm making some sense here.
My mind does not work very well in set-based operations, so I'm confused about how to do this. I would prefer a solution with an explanation of why it works. Since I'd like to be "taught how to fish", so to speak =)
Additional Information:
ID
is an int primary key (auto-incremented)CROSS APPLY
or CTE
's)Upvotes: 3
Views: 311
Reputation: 57583
A possible solution could be
SELECT c1.id, c1.name, c1.company, c1.addr1
FROM CustomerDB c1 LEFT JOIN CustomerDB c2
ON (c1.addr1 = c2.addr1 and c1.id >= c2.id)
GROUP BY c1.addr1, c1.id, c1.name, c1.company
HAVING COUNT(*) <= 3
The magic (nice definition you gave) is done by GROUP BY ... HAVING ...
part.
Sure you know WHERE
is used to filter data while selecting; well, HAVING
is used in conjunction with GROUP BY
to filter grouped data (so after grouping).
So I first group data (order is important) by address and then limit the count for each group to three.
Now let's talk about the inner part (you didn't ask this, but I thought it was good to provide you a complete example).
The ON
part is used to tell the engine which fields are to be used to join tables; here I match tables on address and then I take records having an higher id (so, sorting ascending).
Upvotes: 3