gymcode
gymcode

Reputation: 4623

SELECT Full Row with Duplicated Data in One Column

I am able to list emails which are duplicated.

But I am unable to list the entire row, which contain other fields such as EmployeeId, Username, FirstName, LastName, etc.

SELECT
  Email, 
  COUNT(Email) AS NumOccurrences
FROM EmployeeProfile
GROUP BY Email
HAVING ( COUNT(Email) > 1 )

May I know how can I list all field in the rows that contains Email appearing more than once in the table?

Upvotes: 5

Views: 2092

Answers (4)

Yogesh Sharma
Yogesh Sharma

Reputation: 50163

If EmployeeId is unique, then you can EXISTS :

SELECT ep.*
FROM EmployeeProfile ep
WHERE EXISTS (SELECT 1 
              FROM EmployeeProfile ep1 
              WHERE ep1.Email = ep.Email AND ep1.EmployeeId <> ep.EmployeeId
             );

Upvotes: 0

JohnLBevan
JohnLBevan

Reputation: 24410

SQL Fiddle

with cte as (
  select *
  , count(1) over (partition by email) noDuplicates
  from Demo
)
select *
from cte
where noDuplicates > 1
order by Email, EmployeeId

Explanation:

I've used a common table expression (cte) here; but you could equally use a subquery; it makes no difference.

This cte/subquery fetches every row, and includes a new field called noDuplicates which says how many records have that same email address (including the record itself; so noDuplicates=1 actually means there are no duplicates; whilst noDuplicates=2 means the record itself and 1 duplicate, or 2 records with this email address). This field is calculated using an aggregate function over a window. You can read up on window functions here: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017

In out outer query we're then selecting only those records with noDuplicates greater than 1; i.e. where there are multiple records with the same mail address.

Finally I've sorted by Email and EmployeeId; so that duplicates are listed alongside one another, and are presented in the sequence in which they were (presumably) created; just to make whoever's then dealing with these results life easy.

Upvotes: 1

gotqn
gotqn

Reputation: 43626

Try this:

WITH DataSource AS
(
    SELECT *
          ,COUNT(*) OVER (PARTITION BY email) count_calc
    FROM EmployeeProfile
)
SELECT *
FROM DataSource
WHERE count_calc > 1

Upvotes: 5

holder
holder

Reputation: 585

select distinct * from EmployeeProfile where email in (SELECT 
Email
FROM EmployeeProfile
GROUP BY Email
HAVING COUNT(*) > 1 )

Upvotes: 1

Related Questions