Heinrich
Heinrich

Reputation: 2234

SQL distinct on 2 columns

Is there a way to query distinct on the combination of two columns? Let's say that I have the table below where where the LinkedIds are foreign keys from the same table and the table below links the two together. What I need is a way to query the table to get distinct entries such that I get the results in figure 2 when querying figure 1.

It doesn't matter which row I would get back just that I get back one. Anyway to achieve this with plain SQL?

Figure 1.

+----+-------------------+-------------------+
| Id | Table1_LinkedId_1 | Table1_LinkedId_2 |
+----+-------------------+-------------------+
|  1 | 15                | 48                |
|  2 | 48                | 15                |
|  3 | 1                 | NULL              |
|  4 | NULL              | 5                 |
|  5 | 12                | 51                |
+----+-------------------+-------------------+

Figure 2.

+----+-------------------+-------------------+
| Id | Table1_LinkedId_1 | Table1_LinkedId_2 |
+----+-------------------+-------------------+
|  1 | 15                | 48                |
|  3 | 1                 | NULL              |
|  4 | NULL              | 5                 |
|  5 | 12                | 51                |
+----+-------------------+-------------------+

Upvotes: 0

Views: 79

Answers (3)

Jason A. Long
Jason A. Long

Reputation: 4442

This is an option...

IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
DROP TABLE #TestData;

CREATE TABLE #TestData (
    Id INT NOT NULL PRIMARY KEY,
    Link1 INT NULL,
    Link2 INT NULL 
    );
INSERT #TestData (Id, Link1, Link2) VALUES
    (1,  15  , 48  ), 
    (2,  48  , 15  ), 
    (3,  1   , NULL), 
    (4,  NULL, 5   ), 
    (5,  12  , 51  );

-- =================================================================

WITH 
    cte AS (
        SELECT 
            td.Id, td.Link1, td.Link2,
            RN = ROW_NUMBER() OVER (PARTITION BY IIF(td.Link1 <= td.Link2, td.Link1, td.Link2), IIF(td.Link1 > td.Link2, td.Link2, td.Link1) ORDER BY (SELECT NULL))
        FROM
            #TestData td
        )
SELECT 
    c.Id, c.Link1, c.Link2
FROM
    cte c
WHERE 
    c.RN = 1;

Results...

Id          Link1       Link2
----------- ----------- -----------
3           1           NULL
4           NULL        5
5           12          51
1           15          48

Upvotes: 2

jperelli
jperelli

Reputation: 7207

You could try something along this lines (not tested)

SELECT
  *
FROM
  (
    SELECT
      CASE
          WHEN Table1_LinkedId_1 >= Table1_LinkedId_2 THEN Table1_LinkedId_1
          ELSE Table1_LinkedId_2
      END AS Max,
      CASE
        WHEN Table1_LinkedId_1 <= Table1_LinkedId_2 THEN Table1_LinkedId_1
        ELSE Table1_LinkedId_2
      END AS Min
    FROM
      Figure1
  ) as Inner
GROUP BY
  Max,
  Min

Upvotes: 1

Ian Kenney
Ian Kenney

Reputation: 6446

You can create an ordered version of the table

select 
  id, 
  case when Table1_LinkedId_1 < Table1_LinkedId_2 then Table1_LinkedId_1 else Table1_LinkedId_2 end link1,
  case when Table1_LinkedId_1 < Table1_LinkedId_2 then Table1_LinkedId_2 else Table1_LinkedId_1 end link2
from links

then you can use this to dedupe

select 
  min(id) id,
  link1,
  link2
from 
(
      select 
        id, 
        case when Table1_LinkedId_1 < Table1_LinkedId_2 then Table1_LinkedId_1 else Table1_LinkedId_2 end link1,
        case when Table1_LinkedId_1 < Table1_LinkedId_2 then Table1_LinkedId_2 else Table1_LinkedId_1 end link2
      from links
) x
group by link1, link2

Upvotes: 2

Related Questions