Reputation: 11

MS SQL Server select unique rows based on column value

I have an SQL select query that joins different tables. The result is something like the first table below

p_id | colAprimaryEQsecondary | p_cat | colA | colB | colC  
-----|------------------------|-------|------|------|------
001  |           1            |   1   |  x   | boo1 | flo1    
001  |           1            |   2   |  x   | boo1 | flo1    
002  |           0            |   1   |  a   | boo2 | flo2    
002  |           0            |   2   |  b   | boo2 | flo2    
003  |           1            |   1   |  z   | boo3 | flo3    
003  |           1            |   2   |  z   | boo3 | flo3

p_id represents the foreign key column of the database. Every cell value is the same within each row with corresponding p_ids (p_cat is supposed to be different), except sometimes column colA has 2 different values. Column colAprimaryEQsecondary contains a 0 or 1. 1 meaning the colA-value is the same in rows with corresponding p_id.

I want to get a result like this:

p_id | colAprimaryEQsecondary | p_cat | colA | colB | colC  
-----|------------------------|-------|------|------|------
001  |           1            |   2   |  x   | boo1 | flo1     
002  |           0            |   1   |  a   | boo2 | flo2    
002  |           0            |   2   |  b   | boo2 | flo2    
003  |           1            |   2   |  z   | boo3 | flo3

So I want to create an SQL query that looks at the value of colAprimaryEQsecondary. If the value equals 0 I want to get both rows, if the value equals 1 I only want to get the row with p_cat equals 2.

I tried doing this with different methods, group by, select distinct, select distinct on, case, but I couldn't get the right select query. (Unfortunately I have deleted the select statements I tried).

How would I solve this?

Upvotes: 1

Answers (3)

gotqn

Reputation: 43666

DECLARE @DataSource TABLE
(
    [p_id] CHAR(3)
   ,[colAprimaryEQsecondary] TINYINT
   ,[p_cat] TINYINT
   ,[colA] CHAR(1)
   ,[colB] VARCHAR(8)
   ,[colC] VARCHAR(8)
);

INSERT INTO @DataSource ([p_id], [colAprimaryEQsecondary], [p_cat], [colA], [colB], [colC])
VALUES ('001', '1', '1', 'x', 'boo1', 'flo1')
      ,('001', '1', '2', 'x', 'boo1', 'flo1')
      ,('002', '0', '1', 'a', 'boo2', 'flo2')
      ,('002', '0', '2', 'b', 'boo2', 'flo2')
      ,('003', '1', '1', 'z', 'boo3', 'flo3')
      ,('003', '1', '2', 'z', 'boo3', 'flo3');

SELECT * 
FROM @DataSource
WHERE [colAprimaryEQsecondary] = 0
UNION ALL
SELECT [p_id], [colAprimaryEQsecondary], MAX([p_cat]), [colA], [colB], [colC]
FROM @DataSource
WHERE [colAprimaryEQsecondary] = 1
GROUP BY [p_id], [colAprimaryEQsecondary], [colA], [colB], [colC];

Upvotes: 1

Gordon Linoff

Reputation: 1271161

You can do this with simple logic in the WHERE clause:

with q as (
      <your query here>
     )
select
from  q
where (colAprimaryEQsecondary = 0) or
      (colAprimaryEQsecondary = 1 and p_cat = 2)

Upvotes: 2

Gautam

Reputation: 85

Try a union of two different query one having a where clause with colAprimaryEQsecondary =0 and the other with colAprimaryEQsecondary=1 and for colAprimaryEQsecondary=1 do a count on colAprimaryEQsecondary.

Upvotes: 0

MS SQL Server select unique rows based on column value

Answers (3)

Related Questions