Reputation: 1710

Select one set of duplicate data

Using SQL Server, how do I select only one "set" of relationships? To explain, here is an example. Also if you can help me figure out how to "say" this problem to make it more google-able, that would be stellar. For example, the single table contains two identical, inverted rows.

CREATE TABLE #A (
    EmployeeID INT,
    EmployeeName NVARCHAR(100),
    CoworkerID INT,
    CoworkerName NVARCHAR(100)
)

INSERT INTO #A VALUES (1, 'Alice', 2, 'Bob')
INSERT INTO #A VALUES (2, 'Bob', 1, 'Alice')
INSERT INTO #A VALUES (3, 'Charlie', 4, 'Dan')
INSERT INTO #A VALUES (4, 'Dan', 3, 'Charlie')

SELECT *
FROM #A

// THIS IS WHAT WE WANT TO PROGRAMATICALLY DETERMINE - which ID's to keep and which to delete.

DELETE FROM #A WHERE EmployeeID = 2
DELETE FROM #A WHERE EmployeeID = 4

SELECT *
FROM #A

So, the net result of the final query is:

|------------|---------------|-------------|--------------|
| EmployeeID | EmployeeName  | CoworkerID  | CoworkerName |
|------------|---------------|-------------|--------------|
|      1     |    Alice      |     2       |     Bob      |
|------------|---------------|-------------|--------------|
|      3     |    Charlie    |     4       |     Dan      |
|------------|---------------|-------------|--------------|

Upvotes: 0

Answers (3)

Randy in Marin

Reputation: 1143

This will only delete those with a circular reference. If you have more complex chains, this will not affect them.

DELETE a
FROM #A a
WHERE EXISTS (
    SELECT * 
    FROM #A t 
    WHERE t.CoworkerID = a.EmployeeID
        AND  t.EmployeeID < a.EmployeeID
    )

Upvotes: 1

Joe Taras

Reputation: 15399

Try this:

DELETE FROM #A 
WHERE #A.EmployeeID > #a.CoworkerID
AND EXISTS(SELECT 1 FROM #A A2
       where a2.CoworkerID = #A.EmployeeID
       and a2.employeeID = #a.coworkerID)

See SqlFiddle

Upvotes: 1

benjamin moskovits

Reputation: 5458

 WITH t2CTE AS  
    (  
       SELECT*, ROW_NUMBER() over (PARTITION BY employeename ORDER BY employeename) as counter1  
       FROM #a
    )  
    DELETE FROM t2CTE WHERE counter1 >1

Upvotes: -1

Select one set of duplicate data

Answers (3)

Related Questions