Reputation: 5453
I have a table named employee, there are many records in this table. Here is some sample data :
fullname | address | city
-----------------------------
AA address1 City1
AA address3 City1
AA address8 City2
BB address5 City2
BB address2 City1
CC address6 City1
CC address7 City2
DD address4 City1
I want to have a SELECT
query in sql server which will show only the duplicate records based on the columns fullname
and city
. For the given data and considering the condition, only the first two records is duplicate. So my expected output should be like below :
fullname | address | city
-----------------------------
AA address1 City1
AA address3 City1
To get this output, I have this query :
select fullname, city from employee group by fullname, city having count(*)>1
As you can see, it selects two columns only and thus it is giving the following output :
fullname | city
------------------
AA City1
If I re-write the query like below :
select fullname, city, address from employee group by fullname, city, address
having count(*)>1
Unfortunately it is showing no records! Can anybody please help me to write the correct query to get the expected result?
Upvotes: 13
Views: 16474
Reputation: 109
SELECT Feild1, Feild2, COUNT() FROM table name GROUP BY Feild1, Feild2 HAVING COUNT()>1
This will give you all yours answer
Upvotes: -1
Reputation: 136
Agree with above answer. But If you don't want to use Windows functions which might not work properly on all DBs you can join to itself on city and full name after the group by and having and then get the addresses
Select employee.* from employee
join (select fullname, city from employee group by fullname, city having count(*)>1) q1
on q1.fullname = employee.fullname and q1.city = employee.city
Upvotes: 2
Reputation: 334
Here you go with the solution:
DECLARE @Employee TABLE
(
Fullname VARCHAR(25),
[Address] VARCHAR(25),
City VARCHAR(25)
)
INSERT INTO @Employee VALUES
('AA', 'address1', 'City1')
,('AA', 'address1', 'City1')
,('AA', 'address3', 'City1')
,('AA', 'address8', 'City2')
,('BB', 'address5', 'City2')
,('BB', 'address2', 'City1')
,('CC', 'address6', 'City1')
,('CC', 'address7', 'City2')
;WITH cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY FullName, [Address], [City] ORDER BY Fullname) AS sl,
HashBytes('MD5', FullName + [Address] + [City]) AS RecordId
FROM @Employee AS e
)
SELECT c.FullName,
c.[Address],
c.City
FROM cte AS c
INNER JOIN cte AS c1
ON c.RecordId = c1.RecordId
WHERE c.sl = 2
Result :
FullName Address City
AA address1 City1
AA address1 City1
Upvotes: 0
Reputation: 1269463
If you have a unique id or the address is always different, you can try:
select e.*
from employee e
where exists (select 1
from employee e2
where e2.fullname = e.fullname and e2.city = e.city and
e2.address <> e.address -- or id or some other unique column
);
Although I would probably go with the window function approach, you might find that under some circumstances, this is faster (especially if you have an index on employee(fullname, city, address)
).
Upvotes: 1
Reputation: 21
Try the Following Code:
create table ##Employee
(Fullname varchar(25),
Address varchar(25),
City varchar(25))
insert into ##Employee values
( 'AA', 'address1', 'City1')
,( 'AA', 'address3', 'City1')
,( 'AA', 'address8', 'City2')
,( 'BB', 'address5', 'City2')
,( 'BB', 'address2', 'City1')
,( 'CC', 'address6', 'City1')
,( 'CC', 'address7', 'City2')
select E.* from ##Employee E
cross apply(
select Fullname,City,count(Fullname) cnt from ##Employee
group by Fullname,City
having Count(Fullname)>1)x
where E.Fullname=x.Fullname
and E.City=x.City
Upvotes: 1
Reputation: 452977
Instead of a grouped COUNT
you can use it as a windowed aggregate to access the other columns
SELECT fullname,
address,
city
FROM (SELECT *,
COUNT(*) OVER (PARTITION BY fullname, city) AS cnt
FROM employee) e
WHERE cnt > 1
Upvotes: 19