jbobbylopez
jbobbylopez

Reputation: 369

How do I write an SQL query to identify duplicate values in a specific field?

This is the table I'm working with:

The table

I would like to identify only the ReviewIDs that have duplicate deduction IDs for different parameters.

For example, in the image above, ReviewID 114 has two different parameter IDs, but both records have the same deduction ID.

For my purposes, this record (ReviewID 114) has an error. There should not be two or more unique parameter IDs that have the same deduction ID for a single ReviewID.

I would like write a query to identify these types of records, but my SQL skills aren't there yet. Help?

Thanks!

Update 1: I'm using TSQL (SQL Server 2008) if that helps
Update 2: The output that I'm looking for would be the same as the image above, minus any records that do not match the criteria I've described.

Cheers!

Upvotes: 6

Views: 28785

Answers (3)

Michael L.
Michael L.

Reputation: 618

If I understand the criteria: For each combination of ReviewID and deduction_id you can have only one parameter_id and you want a query that produces a result without the ReviewIDs that break those rules (rather than identifying those rows that do). This will do that:

;WITH review_errors AS (
  SELECT  ReviewID
  FROM test
  GROUP BY ReviewID,deduction_ID
  HAVING COUNT(DISTINCT parameter_id) > 1
)
SELECT t.*
FROM test t
  LEFT JOIN review_errors r
    ON t.ReviewID = r.ReviewID
WHERE r.ReviewID IS NULL

To explain: review_errors is a common table expression (think of it as a named sub-query that doesn't clutter up the main query). It selects the ReviewIDs that break the criteria. When you left join on it, it selects all rows from the left table regardless of whether they match the right table and only the rows from the right table that match the left table. Rows that do not match will have nulls in the columns for the right-hand table. By specifying WHERE r.ReviewID IS NULL you eliminate the rows from the left hand table that match the right hand table.

SQL Fiddle

Upvotes: 3

Ellesedil
Ellesedil

Reputation: 1626

SELECT * FROM table t1 INNER JOIN (
    SELECT review_id, deduction_id FROM table
    GROUP BY review_id, deduction_id
    HAVING COUNT(parameter_id) > 1
) t2 ON t1.review_id = t2.review_id AND t1.deduction_id = t2.deduction_id;

http://www.sqlfiddle.com/#!3/d858f/3

If it is possible to have exact duplicates and that is ok, you can modify the HAVING clause to COUNT(DISTINCT parameter_id).

Upvotes: 7

DeanOC
DeanOC

Reputation: 7262

Select ReviewID, deduction_ID from Table
Group By ReviewID, deduction_ID
Having count(ReviewID) > 1

http://www.sqlfiddle.com/#!3/6e113/3 has an example

Upvotes: 6

Related Questions