AspireMonkey
AspireMonkey

Reputation: 23

T-SQL, using Union to dedupe , but retain an additional column

I am using T-SQL to return records from the database (there are multiple criteria, but the list of unique ID's must be distinct), in short, the T-SQL looks like this:

SELECT
t1.ID,
[query1mark] = 1
WHERE criteria1 = 1

UNION

SELECT
t1.ID,
[query2mark] = 1
WHERE criteria2 = 1

I would like to be able to use Union to de-dupe on the ID field (the data has to be unique on the ID field), whilst retaining the derived column "query1mark" or "query2mark" to highlight which query it additionally came from. In my real world case, there are 5 queries that need to be de-duped against each other, so I need an efficient solution is possible.

EDIT: Additionally, the results from the first query need to be prioritised over those from the second query, and the results from the second query need to be prioritised over those from the third query, as I understand, this feature is inherent when using Union, as it will only add records from below the Union statement.

Is Union the best solution for this, and if not, what can I use?

Thanks

Upvotes: 0

Views: 125

Answers (2)

Kannan Kandasamy
Kannan Kandasamy

Reputation: 13959

you can use top 1 with ties

SELECT top 1 with ties * FROM yourtable
   ORDER BY ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [criteria])

Upvotes: 0

gotqn
gotqn

Reputation: 43646

What about this:

DECLARE @DataSource TABLE
(
    [ID] INT
   ,[criteria] INT
);

INSERT INTO @DataSource ([ID], [criteria])
VALUES (1, 1)
      ,(1, 2)
      ,(2, 1)
      ,(3, 1)
      ,(3, 2)
      ,(4, 2);

WITH DataSource ([ID], [query_mark], [RowID]) AS
(
    SELECT *
          ,ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [criteria] ASC)
    FROM @DataSource
)
SELECT [id], [query_mark]
FROM DataSource
WHERE [RowID] = 1;

The idea is to create sequence of all duplicated elements for particular group. The duplicates are order by the criteria field, but yo can change the logic if you need - for example to show the biggest criteria. The group is defined using the PARTITION BY [ID] statement, which means, order items for each [ID] group. Then, in the select, we only need to show one record per each group [RowID] = 1

Upvotes: 1

Related Questions